STUDIES IN MEASUREMENT OF THE RELATIONS
AMONG SOVEREIGN STATES * 


FRANK L. KLINGBERG 
JAMES MILLIKIN UNIVERSITY 


This article describes the application of three psychometric 
methods to the problem of measurement of the friendly or hostile 
relations among states of the world today. To secure judgments, 
schedules were sent to students of international affairs at several 
times during the last five years. The method of equal-appearing 
intervals was used to determine the relative probability of war for 
88 pairs of states in January, 1937; the method of “triadic combina- 
tions” to determine relative friendliness among the Great Powers 
in November, 1938; and the method of “multidimensional” or group 
rank order to measure the attitudes of important states toward the 
Great Powers in March and April, 1939; June, 1940; and June, 
1941. A chart of scale values for the pairs of Great Powers shows 
the changing trends since 1937. The last two methods were used to 
depict the Great Powers in multidimensional space according to their
mutual friendliness, thus permitting the application of a type of fac-
tor analysis. The reliability of the methods employed was high, and
various types of evidence support the general validity of the results. 


In our efforts to understand or control the development of friendly 
or hostile relations among states, we need to devise and utilize methods 
of “measuring” these relations. Such measurements would enable us 
to be more certain of the trends in international affairs and of the 
effects of significant events. This article describes the application of 
three psychometric methods to the problem. 

Historians have made many estimates of the degree of friendli- 
ness or hostility among states during some periods of the past, al- 
though these cannot be considered measurements. For more immediate 
developments we would like to be able to measure inter-state relations 
of the present. Then, over a period of time, the trends could be studied 
and significant factors evaluated. 

We should like to measure the “psychological distance” between 
any two states at a given time, in terms of more or less friendliness, 
or more or less probability of war. It is natural to speak of friendly 
entities as being close together—that is, separated by a short “dis- 
tance”—and of hostile entities as far apart or “distant.” This “psy- 
chological distance” between states is the resultant, at least in part, 
of the attitudes of the states toward each other. By the “attitude” of
a state we mean the “attitude” of the politically effective part of the
population. This may mean primarily the attitude of the government
(as in a dictatorship), or the attitude or opinion of a large part of the
electorate (as in a democracy). In general, we would expect the “at-
titude” of a state, as expressed largely in official acts and utterances,
to represent the desires and interests of the dominant groups in the
state, and, to a lesser extent, the desires and interests of those who
are in a position to exert pressure upon the dominant groups.

* The writer is indebted to Professors Quincy Wright and L. L. Thurstone,
and to Dr. M. W. Richardson, for advice at various stages in these studies.

What indices should be used in estimating the relative degree of 
friendliness existing between any two states? It is natural to turn to 
the methods recently developed by psychologists for the measurement 
of attitudes. The method of equal-appearing intervals has been applied 
to the measurement of “press attitudes” — attitudes of important 
newspapers toward certain nations over a period of time (1). The 
attitude of the “man in the street” (in the United States and Great 
Britain) toward various nations has been systematically sampled, 
notably in the American Institute of Public Opinion, directed by Dr. 
George Gallup. Another type of index is found in the official acts of a 
state—as treaties, notes, speeches, declarations of war, invasions, and 
the like. 

All of these indices are important, yet all of them are weakened 
as guides by the fact that it is sometimes difficult to ascertain the rela- 
tive importance of any one index in determining the attitude of a 
state. For example, how much weight should be given to the opinions 
expressed in editorials of the New York Times or the Chicago Tribune 
in an index of American attitudes? How far does the opinion of the 
electorate as a whole, as indicated in a Gallup poll, represent the 
opinion of the “politically effective” part of the population in foreign 
policy? Does a trade treaty, or an increase in trade, indicate more 
friendliness between two states? How much weight should be given 
to a speech by Hitler or Petain or Roosevelt? Did the invasion of 
Bessarabia by Russia indicate more or less friendliness between 
Russia and Germany, or Russia and Great Britain? On all questions 
of this type, judgments are continually being made by students of 
international affairs. This at once suggests another important source 
of information as to national attitudes: the knowledge of those stu- 
dents who have specialized in international affairs. This was the 
source—expert opinion, we may say—which was utilized in the 
present study. 

These students are qualified to make judgments involving “more 
or less” friendliness or hostility between states. The results of these 
judgments from a sample of the “universe” of experts can then be 
scaled (2), with the expectation that this composite judgment will be 
more trustworthy than the opinions of a few scattered experts. There
is evidence to support the belief that the validity of judgments is 
partly a function of the number of judges (3). 

In the methods used in this study, it will be noted that there was 
no attempt to measure the attitude of a group of experts toward cer- 
tain states, but, instead, only the experts’ opinion as to the attitudes of 
various states toward one another. This may make the judgments 
more reliable. For example, American communists, “democrats,” and 
fascists might agree that the United States was more friendly to 
Russia than to Germany, on a given date, in spite of the wide variation 
in their personal attitudes. Some studies have shown that biased 
opinions of the judges do not markedly influence judgments of this 
sort (4). There is as yet insufficient evidence, however, to permit sure 
conclusions as to the possible effect, on judgments concerning the 
attitudes of states, of the personal attitudes of the judges toward the 
states in question, or of the general impression the judges may have 
formed of the states (a possible “halo” effect), or of the influence of 
the form of presentation of the states on the schedules for the judges. 
But the high reliability of the judgments in the present studies leads 
one to minimize the importance of these factors. 

Dr. L. L. Thurstone and others have applied to the measurement 
of attitudes the methods of equal-appearing intervals, paired com- 
parisons, and rank order (5). These methods permit the representa- 
tion of judgments of magnitude of a series of stimuli on a linear 
scale. More recently, Dr. M. W. Richardson has begun to develop 
methods for the representation of attitudes in more than one dimen- 
sion, since there is a high degree of probability that many attitudes 
are so complex that more than one dimension is required for adequate 
description. (6) 


Methods 


Three psychometric methods were employed in the present 
studies: (1) the method of equal-appearing intervals, in a schedule 
on probability of war, in January, 1937; (2) the method of “triadic 
combinations,” in a schedule on the relative friendliness among the 
seven Great Powers, in November, 1938; and (3) the method of “mul- 
tidimensional rank order,” in a schedule on the attitudes of twenty-one 
states toward the seven Great Powers, in March and April, 1939, in 
June, 1940 (before the Franco-German armistice), and in June, 1941 
(before the German-Russian war). The chief object was the develop- 
ment of the most reliable and useful types of schedules for the meas- 
urement of expert opinion on inter-state relations, with a view to the 
possible application of the procedure at regular intervals for the de-
termination of trends in international attitudes. It was expected, too,
that the results would give valuable indications of the relations and 
trends among various states for the period studied. 

A brief discussion of the procedures involved will be followed by 
conclusions as to the reliability and validity of the methods. 

The Method of Equal-Appearing Intervals.—The method of equal- 
appearing intervals was used in January, 1937, to measure opinion as 
to the relative probability of war for eighty-eight pairs of states, in- 
cluding all the pairs of the seven Great Powers. The experiment was 
begun with the 325 possible pairs of 26 important and representative 
states of the world. When these pairs were classified by four experts 
into eleven groups, on a scale of war probability from 0 to 10, a very 
large number were put in the groups labeled 0, 1, and 2. By eliminat- 
ing most of these, the number of pairs was reduced to 88 for the final 
schedule. This schedule was sent to 220 outstanding students of inter- 
national affairs in the United States, Canada, and Europe, with in- 
structions to put a figure from 0 to 10 before each pair of states to 
indicate their opinion of the chance that “war will exist between 
them within the next ten years.” Schedules were filled out by 83 of 
these scholars. The judges were to put 0 if they thought war was 
practically impossible, 10 if they thought it was practically certain, 
and 5 if they thought the chances 50-50. They were asked to give a 
rating as objective as possible of the relative probability of war be- 
tween the pairs of states, “under present conditions.” The phrase 
“within the next ten years” was used so that the judges would have 
long-time as well as short-time trends in mind. It was stated that the 
word “war” was used “in the ordinary sense of the term to mean 
military operations on a large scale designed to compel submission of 
the opposing government.” 

Scale values and Q-values (interquartile range) were calculated 
according to the procedure outlined by Thurstone and Chave in their 
study of attitudes toward the church (7). Relative scale values for 
the 21 pairs of Great Powers are shown on the first scale of the chart 
(Fig. 1). The scaled results were used not only to represent the 
judges’ expectation as to war (that is, as an attempt at group pre- 
diction), but more especially as an estimate of the relative degree of 
hostility existing among the Great Powers in January, 1937. It should 
be noted, however, that “probability of war” would seem, especially
for the smaller states, to be dependent upon other factors than hos- 
tility or friendliness—factors such as geographical position, close- 
ness of contact, and relative power. 
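
As a rough illustration of the scaling step, the sketch below takes the scale value of a pair to be the median of the judges' ratings and the Q-value to be the interquartile range. The ratings are hypothetical, the function name `scale_and_q` is introduced here only for illustration, and the quartile convention is an assumption. Since the published scale runs from about −1.3 to 10.7, the author's computation evidently differed in detail from this simplification.

```python
from statistics import median, quantiles

def scale_and_q(ratings):
    """Return (scale value, Q-value) for one pair of states.

    Assumption: scale value = median rating and Q = Q3 - Q1, with the
    quartiles interpolated between observed ratings ("inclusive" rule).
    """
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    return median(ratings), q3 - q1

# Hypothetical 0-10 ratings from nine judges for a single pair of states.
ratings = [3, 4, 4, 5, 5, 5, 6, 6, 7]
scale_value, q_value = scale_and_q(ratings)
print(scale_value, q_value)
```

A small Q-value indicates close agreement among the judges on that pair; a large one marks a pair on which opinion was divided.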

The Method of Triadic Combinations.—Dr. M. W. Richardson’s 
method of “triadic combinations,” which he first applied successfully
to a two-dimensional case of color perception, was applied to a con- 
cept less complex than “probability of war”—namely, to the relative 
friendliness or hostility existing at the time (November, 1938), be- 
tween the Great Powers. Furthermore, the task given the judges was 
easier. The stimuli (in this case, the seven Great Powers) were pre- 
sented in all the possible combinations of three (that is, in triads), 
with the states placed at the apices of equilateral triangles. It was 
to be expected that the different groupings of the seven stimuli in 
the 35 triads would bring the various factors operative in the inter- 
national situation to the attention of the judges. The judge was re- 
quested to indicate by the letter H the two most hostile (or least 
friendly) powers in each triad, and by the letter F the two most 
friendly (or least hostile) powers. The schedules were filled out by 
144 students and faculty members of the University of Chicago. There 
were 60 undergraduates, 35 graduates in international relations, and 
49 faculty members. Fewer than half of these can be classed as “ex-
perts,” yet all the judges were doubtless quite familiar with the rela-
tionships among the seven Great Powers.

Scale values were determined for the 21 pairs of Great Powers 
by Thurstone’s “Law of Comparative Judgment,” and also by the 
method of average proportions, with approximately the same results 
(8). The scale values according to the method of average proportions 
are shown on the second scale of Fig. 1. Because of its simplicity, the 
method of average proportions was also used in determining scale 
values for the rank order method. As applied, it consisted in finding 
the average proportion of the judges who regarded any pair of states 
as more hostile than all the other pairs with which it was directly 
compared, and then changing this average proportion to a sigma- 
value or X-value (the standard-deviation score on the X-scale of the 
normal probability curve). 
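
The conversion just described can be sketched as follows. This is a minimal illustration, not the author's worksheet: the proportions are hypothetical, the function name `sigma_value` is introduced here, and the `floor` guard against proportions of exactly 0 or 1 is an added assumption.

```python
from statistics import NormalDist, mean

def sigma_value(proportions, floor=0.01):
    """Convert a pair's comparison proportions to a normal-deviate score.

    proportions: for each pair it was directly compared with, the share of
    judges who regarded this pair as the more hostile of the two.
    floor: hypothetical guard keeping the average away from 0 and 1, where
    the normal deviate is infinite.
    """
    p = mean(proportions)
    p = min(max(p, floor), 1.0 - floor)
    return NormalDist().inv_cdf(p)

print(round(sigma_value([0.95, 1.0]), 2))   # average proportion 0.975
print(sigma_value([0.5, 0.5]))              # pair judged more hostile half the time
```

A pair judged more hostile than its comparison pairs about half the time falls near zero on the scale; pairs judged more hostile most of the time receive positive values under this sign convention.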

Not only were the pairs of states placed on a linear continuum 
in order of their relative friendliness, but the “psychological dis- 
tances” (in terms of amount of hostility), between each state and 
every other state used as a stimulus, were determined, so that the 
states could be represented as points in multidimensional space. The 
smallest number of dimensions necessary to construct the points, with 
all the mutual distances correct, should be the smallest number of 
factors necessary to explain the configuration, and it should be pos- 
sible to give these factors (or dimensions) meaningful names. A 
three-dimensional wax model of the states was constructed, with all 
the inter-state distances approximately correct. The procedure is
outlined in the section on multidimensional analysis below; a photo-
graph of the model appears in Fig. 2.
The Method of Multidimensional Rank Order.—One of the dis-
advantages of the method of triadic combinations is that only a small
number of stimuli can be conveniently used. The same difficulty ap- 
plies to the standard method of paired comparisons, in which every 
stimulus is compared with every other one, in pairs. It was desired 
to develop a schedule with most of the advantages of these two meth- 
ods, but without the drawback of limiting so greatly the number of 
stimuli. Some form of the method of rank order seemed to be the 
answer to the problem (9). 

It was desired to use more states than just the seven Great Pow- 
ers. But there are twenty-one pairs of Great Powers, and a group of 
graduate students in international relations agreed that it was quite 
difficult to rank these twenty-one pairs in order of the degree of 
friendliness or hostility. It is much easier to rank a smaller number 
of stimuli, and the results are correspondingly more reliable. So, for 
the Great Powers, it was decided to have the ranking done in groups, 
so that only six Great Powers would be ranked, in order of the friend- 
liness of the seventh Great Power toward them. This method was 
somewhat similar to the method of triadic combinations, since the 
stimuli were presented in different groups so that different factors 
might be emphasized in each group, though perhaps to a lesser extent; 
and it had the advantages of furnishing twice as many judgments 
from each judge and of causing the judge to keep in mind some pic- 
ture of the general international situation (since all the Great Pow- 
ers were in each group). Furthermore, the rankings could be made 
quite quickly. 

Two other advantages were secured. In the study on the prob- 
ability of war (method of equal-appearing intervals), one of the im- 
portant factors noted was the problem of “polarization,” or the ten- 
dency of states to be drawn into a war begun by other states or into 
an alliance. It was desired to secure the composite opinion of a large 
number of students of international affairs as to the relative friend- 
liness of several important small states toward the Great Powers. So 
the judges were asked to rank the Great Powers in order of the friend- 
liness of fourteen small states toward them. 

The other advantage of this group-method of rank order was the 
direct effort made to discover the “attitude” of each state toward the 
Great Powers, according to men who were in a better position to 
know than most others. Thus the aim was to measure the relative 
friendliness of France, for example, toward the six other Great Pow- 
ers, or of Yugoslavia toward the seven Great Powers. In the method 
of triadic combinations (as well as in the method of equal-appearing 
intervals), the judges were definitely comparing “psychological dis-
tances.” Thus the pair France-Germany was compared with other
pairs in these methods. But in the rank order method, the judges 
gave a relative rating of France’s attitude toward Germany and also 
of Germany’s attitude toward France—a comparison of “vectors,” 
we might say. It was to be expected that a number of these attitudes 
would not be equal to their “reverse” attitudes. 

Six hundred schedules were mailed, on March 1, 1939, to stu- 
dents of international affairs in the United States and Canada. Tabu- 
lations were made separately for the 193 schedules filled in before 
March 14 (when Germany occupied Bohemia and Moravia), and the 
48 schedules filled in from March 14 to April 22 (to be referred to as 
the judgments for April, 1939). Scale values were determined for 
the pairs of states by the method of average proportions, using the 
average of the attitudes of both states in each pair (for the scale val- 
ues for the pairs of Great Powers, see the third and fourth scales in 
Fig. 1). As in the previous method, a model in three dimensions was 
constructed for the Great Powers. 

Since the results obtained by the rank-order method were more 
useful and perhaps more reliable than those obtained by the other 
methods, it was used again when new studies were made in 1940 and 
1941. Members of the faculty of the University of Chicago filled out 
24 schedules in June, 1940, and 40 schedules in June, 1941 (see the 
last two scales in Fig. 1). 


Reliability 

How reliable were the scale values obtained from these five stud- 
ies? In other words, could we expect the results from other similar 
groups of judges, making judgments on the same dates, to give about 
the same scale values? 

Dividing the sixty-five schedules used in January, 1937 into two 
arbitrary groups, it was found that the average difference between 
the ratings was only 0.30 of a scale unit on a scale twelve units long 
(from −1.3 to 10.7). The average probable error for the scale values
was 0.27 of a scale unit (a little over two per cent of the range of the 
scale). This was the most difficult schedule given to the judges. 
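
The split-half check can be sketched as below. The ratings are hypothetical, the choice of alternate schedules as the two "arbitrary" groups is an assumption, and the median rating stands in for the scale value of each pair.

```python
from statistics import mean, median

def split_half_difference(schedules):
    """Mean absolute gap between the scale values of two arbitrary halves.

    schedules: one dict per judge, mapping a pair of states to a rating.
    Assumptions: alternate schedules form the two halves, and the median
    rating stands in for the scale value of each pair.
    """
    half_a, half_b = schedules[0::2], schedules[1::2]
    gaps = [
        abs(median(s[pair] for s in half_a) - median(s[pair] for s in half_b))
        for pair in schedules[0]
    ]
    return mean(gaps)

# Hypothetical ratings from four judges on two pairs of states.
schedules = [
    {"A-B": 8, "C-D": 2},
    {"A-B": 9, "C-D": 2},
    {"A-B": 10, "C-D": 3},
    {"A-B": 9, "C-D": 4},
]
difference = split_half_difference(schedules)
scale_length = 12.0  # length of the scale reported in the text
print(difference, round(100 * difference / scale_length, 1))
```

By the same arithmetic, the reported average difference of 0.30 on a scale twelve units long amounts to 2.5 per cent of the range.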

For the November, 1938, method of triadic combinations, tabu- 
lations were made separately for three groups of judges: undergradu- 
ates, graduates, and faculty members. It was found that there were 
practically no significant differences between the judgments of these 
three groups.* 


* As mentioned above, it can be presumed that the three groups were all
familiar with the relations existing among the seven Great Powers after “Mu-
nich,” in November, 1938. No small states were included in the schedule.








342 PSYCHOMETRIKA 


To determine the reliability of the judgments on the rank-order 
schedule of March, 1939, the schedules were again tabulated in two
arbitrary groups. The average difference between the scale values 
for the two groups was 1.8 per cent for the Great Powers, and 2.6 
per cent for the small states. A direct measure of the degree of agree- 
ment in the rankings of the judges was secured by calculating the 
dispersion and rank correlation of all the judgments. The rank cor- 
relation was high for all Great Powers (as 0.94 for France) except 
Great Britain (0.71). Lesser agreement for Great Britain (in March, 
1939) might be interpreted as meaning that Great Britain was de- 
liberately trying to keep her policy from being clear to others (in 
playing the balance of power), or that she was herself uncertain as 
to what her policy was, or possibly that some of the judges were un- 
familiar with her policy. The dispersion was considerably greater for 
some of the small states, notably Mexico (rank correlation, 0.35), 
Czechoslovakia (0.42), Turkey (0.51), and Yugoslavia (0.58). The 
closest agreement was shown for China (0.94), the Netherlands 
(0.83), Spain (0.83), and Belgium (0.81). 
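
The text does not say which coefficient of rank correlation was used. As one possibility only, the sketch below computes Spearman's coefficient between each judge's ranking and a consensus ranking and averages the results. The rankings are hypothetical and the function names are introduced here for illustration.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

def mean_agreement(judge_ranks, consensus):
    """Average correlation of each judge's ranking with a consensus ranking."""
    return sum(spearman_rho(r, consensus) for r in judge_ranks) / len(judge_ranks)

# Hypothetical rankings of six Great Powers by three judges (1 = most friendly).
consensus = [1, 2, 3, 4, 5, 6]
judges = [
    [1, 2, 3, 4, 5, 6],
    [2, 1, 3, 4, 5, 6],
    [1, 2, 3, 4, 6, 5],
]
print(round(mean_agreement(judges, consensus), 3))
```

Under this reading, a low coefficient for a state means that the judges disagreed widely on how that state ranked the Great Powers.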

How trustworthy were the judgments of the twenty-four Univer- 
sity of Chicago men in June, 1940? A good indication of high relia- 
bility is given by a comparison of the judgments of twenty-four Uni- 
versity of Chicago men in March, 1939 (including twelve of those 
reporting in June, 1940) with the judgments for all the 193 judges 
in March, 1939. The average difference for the attitudes of the Great 
Powers was only 2 per cent, and for the scale values of the pairs of 
Great Powers only 1.6 per cent. The dispersion of the judgments in 
June, 1940 was somewhat less than in 1939. 

The forty schedules returned in June, 1941 were tabulated in 
two arbitrary groups. Agreement was close between the two groups, 
except in the rankings for France. The average difference in scale 
values for the two groups was 5 per cent (of the total length of the 
scale) ; with France excluded, the average difference was only 3 per 
cent. Since the average difference for the attitude of France toward 
the other states was 12 per cent, the scale values shown for France 
in 1941 cannot be regarded as very stable. 

The general conclusion is that the reliability of these methods 
is high, especially when dealing with the relations among the Great 
Powers, even though the number of judges is relatively small. For 
the most reliable results, however, caution would suggest having one 
hundred judges or more for any one study. 
Validity


How valid are the results? That is, how well do they reflect the 
“actual situation?” We do not know what the “actual situation” is, of 
course; but we can note the consistency of various indices used to 
determine friendliness or hostility between states. 

To begin with, the relative positions and trends for the pairs of 
Great Powers from January, 1937 to June, 1941, as shown in the chart 
(Fig. 1), appear quite reasonable. For example, one would expect the 
chart to show a general trend toward more hostility from January, 
1937 to November, 1938, except for the increased friendliness of the 
three democracies, the three members of the anti-Comintern pact, 
and of Germany-Great Britain (during the “appeasement” period). 
Similarly, the increasing hostility of Germany-U.S.A. for the whole 
four years, and the violent shifts in the position of France after June, 
1940, are quite plausible. 

In studying the chart, it should be noted that the last four scales, 
although they show the average of the attitudes of the two states in 
a pair toward each other, do not indicate whether the attitude of each 
state in a pair is changing in the same direction or not. For example, 
the chart shows Germany-U.S.A. as slightly less hostile for the month 
after March 14, 1939; however, the judges regarded the attitude of the 
United States as a little more hostile to Germany, with Germany a 
little less hostile toward the United States. Great Britain and Russia 
make an extreme case: the judges regarded Great Britain as 0.39 of 
a unit more friendly toward Russia from June, 1940 to June, 1941, 
but Russia was considered 0.44 more hostile toward Great Britain. 

A somewhat more objective check on validity is afforded by a 
correlation of the trends shown on the chart with the chief events 
involving any pair of states. All headlined happenings and most other 
significant developments reported by the New York Times were cata- 
logued for the three periods between November, 1938 and June, 1940, 
and classified as indicating more or less friendliness between any two 
states. There were some borderline cases to judge, but in general the 
direction of an event on a friendliness scale was clear. 

The results show a high correlation of the events with the chart 
trend. Out of 168 news events from November 10, 1938 to March 13,
1939, 130 would ordinarily be interpreted as agreeing with the trends
of the chart, with only 38 not agreeing. For only one pair, France- 
U.S.A., did the number of events opposite the trend exceed the num- 
ber agreeing with the trend. The chart shows France-U.S.A. as rel- 
atively less friendly, while the four news events tabulated indicate 
friendliness. But the general position of France-U.S.A. on the scales 
is one of friendliness, so that the lack of correlation may not be so
significant. 

For the month after March 13, 1939, 72 events “agree” with the 
trends, while 37 do not agree. For only one pair, Germany-U.S.A. 
(0.01, less hostile) is there some doubt, as all nine events are “hos-
tile.” Out of 344 events surveyed for the fourteen-month period
from April 15, 1939 to June 17, 1940, 286 are in accord with the trends 
shown on the chart, and 58 are opposed. The only pair in which the 
events differ from the trend is France-Japan (0.23, less hostile), with 
six friendly and six hostile events. 
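
The event counts for the three periods can be reduced to agreement rates with a little arithmetic. The sketch below works from the total and the number of opposing events in each period, taking the agreeing count as the remainder; for the second period the total is the sum of the two reported counts. The period labels and the function name are introduced here for illustration.

```python
# (period, total events, events opposed to the chart trend)
periods = [
    ("November, 1938 to March, 1939", 168, 38),
    ("month after March 13, 1939", 72 + 37, 37),
    ("April 15, 1939 to June 17, 1940", 344, 58),
]

def agreement_rate(total, opposed):
    """Share of catalogued events that agree with the trend on the chart."""
    return (total - opposed) / total

for label, total, opposed in periods:
    agreeing = total - opposed
    print(f"{label}: {agreeing} of {total} agree "
          f"({agreement_rate(total, opposed):.0%})")
```

On these figures, roughly three-quarters of the events in the first period, two-thirds in the second, and five-sixths in the third agree with the trends.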

The study in January, 1937, was designed to measure the judges’ 
opinions as to the relative probability of war for eighty-eight pairs 
of states. The judgments were made on the basis of conditions exist- 
ing at that time, and many of these conditions have changed radically 
since then. The chief value of the study should be in its systematic 
representation of the degree of war probability for the succeeding 
ten years as seen in January, 1937 by competent observers. Many of
the most critical situations of the period since January, 1937 were 
forecast. Within six months of January, 1937, war (in the sense de- 
fined in the schedule) broke out between Japan and China (scored 
as the highest pair, 10.7, on a scale from about −1.3 to 10.7). The
next highest pair was Russia and Japan (9.9): they have carried on 
rather large-scale border hostilities, particularly in August, 1938 (the 
Changkufeng incident) and May-July, 1939. Then came, in order,
Germany-Russia (9.5), Germany-Czechoslovakia (8.55), and France- 
Germany (8.05). Next on the scale was Germany-Great Britain 
(6.0): although war broke out in September, 1939, the relatively low 
score might possibly be regarded as predicting England’s efforts at 
“appeasement” in 1937 and 1938. The next 23 pairs of states (going 
down to a scale value of 3.6) have all engaged in hostilities in some
form or other, with the exception of Portugal-Spain (4.35), Japan-
U.S.A. (4.3), France-Hungary (4.1), and Germany-Italy (3.8). Of
the remaining 59 pairs, only five have engaged in hostilities. One can
conclude that the experts were quite successful in indicating what
states would go to war, but that there was little correlation between
the degree of war probability and the length of time until war
came (10). It should be repeated that the chief value of such ratings
lies in their depiction of the relative probability of war for pairs of
states at a certain time: ratings taken at intervals would doubtless
show many shifts (similar to those in Fig. 1).

* Fig. 1 shows the scale values for the 21 pairs of Great Powers at the six
different dates on which they were judged. The first two scales do not measure
exactly the same thing as do the last four, although the scale values were all cal-
culated by the method of average proportions. The first scale was designed to
measure the judges’ estimates of the relative probability of war; the second, the
relative friendliness or hostility of the pairs of Great Powers; the last four, the
average of the attitudes of each Great Power in a pair toward the other. Thus
only the last four scales are strictly comparable.

The distances between the scales are roughly proportional to the periods of
time intervening, and the slope of the lines indicates the degree of increasing or
decreasing friendliness (or hostility). In a general way, positive numbers on the
scales indicate friendliness, and negative numbers hostility. Lines sloping up-
ward (drawn broken) show increasing friendliness or decreasing hostility; lines
sloping downward (drawn solid) show increasing hostility or decreasing friend-
liness. Each line represents a pair of states. Abbreviations: F—France, G—
Germany, B—(Great) Britain, I—Italy, J—Japan, R—Russia, U—United States.

Further evidence of the validity of the scale positions and trends
shown on the chart is given by the fact that they agree in general 
with the results of the Gallup polls, Fortune surveys, and studies of 
press attitudes. All of these studies are narrowed in scope, however, 
by dealing with the attitudes of only one or two states. Examples of 
comparable results from the Gallup polls are as follows: increasing 
hostility of the United States toward Germany since 1937; increasing 
friendliness toward Great Britain; drop in the friendliness of Ameri- 
cans toward the Soviet Union after December, 1939; both Britons 
and Americans in January, 1939 would rather have had Russia de-
feat Germany in a war between the two; in October, 1937, British
voters liked the United States best (37 per cent), then France (28 
per cent), and Germany (15 per cent); American voters in July, 
1939, favored England (43 per cent) and France (11 per cent) ; there 
was stiffening of American opposition to Japan in the year before 
June, 1941 (11). Press studies have shown that the attitude of three 
great American newspapers became more hostile toward Japan be- 
tween January, 1937 and May, 1938 (1). 

It is, thus, a fair conclusion that the scale positions and trends 
shown on the chart do reflect the “actual situation”; they are not con- 
tradicted by a closer study of events or of other indices. What then 
have we gained from the chart? We can say that we have secured 
scale values which represent accurately the beliefs of American schol- 
ars as to inter-state relationships (although the groups of judges used 
were not necessarily equally typical of the universe of competent 
judges). Moreover, since the results seem valid, we have developed 
another index for determining degrees of inter-state friendship or 
hostility. It seems to be a more inclusive index than any other, since 
we are really measuring the evaluations and conclusions of some of 
the best minds as to the relative significance of all the other indices 
of international relations. The information obtained is clearly only 
a supplement to, not a substitute for, detailed analysis of the rela- 
tions between individual states. The result is a unified picture of 
relationships which represents the net effect of numerous interna-
tional forces as summed up by students of the subject.

The limits of this article prevent a detailed consideration of 
the results secured for the fourteen representative secondary states. 
Ten of the states were European, two American (Mexico and Argen- 
tina), and two Asiatic (China and Turkey, the latter in part Euro- 
pean). Scale values were determined for the attitudes of these states 
toward the seven Great Powers, from March, 1939 to June, 1940. The 
general trends were supported by news events of the periods studied. 
From the March, 1939 scale values for the small states, their atti- 
tudes toward the Great Powers after war began could have been fair- 
ly accurately predicted. 


Multidimensional Analysis 


The methods of triadic combinations and of multidimensional 
rank order permit a form of dimensional (or factor) analysis. Let us 
imagine the seven Great Powers as points in mathematical space. 
From the scale values determined for the twenty-one pairs, we know 
only the differences in distances (in terms of hostility) between the 
points. We need to express these distances in absolute terms in order 
to be able to construct a model of the seven points. A unit must be 
found. Calling the unit (u) the distance between France and Germany,
the scale values can all be expressed in terms of u by determining
the differences between the scale value for each pair and the scale
value for France-Germany. For example, if the X-value for
France-Germany is 0.9504, and the X-value for France-Great Britain is
-0.9371, then the value for France-Great Britain in terms of u is:
u - 1.8875.

The next step is to find the value of the unit. The problem is to 
find a value for u which will permit the inter-state distances to be 
constructed in the smallest possible number of dimensions—i.e., there 
are still two unknowns, the unit and the number of dimensions. In 
an effort to secure an approximate value for these unknowns, the in- 
ter-state distances were first constructed with compasses on a plane 
(two dimensions), by substituting various values for u. When it be- 
came clear that two dimensions were insufficient for representing all 
the distances, three dimensions were used with better results. 

The rational solution of the problem was worked out in an article
by Young and Householder (12). One of their theorems was as
follows: “The dimensionality of a set of points with mutual
distances d_ij is (two) less than the rank of the n+1 square matrix F
....”* A matrix like the one just referred to is made up in this
study of the squares of the inter-state distances (expressed in terms
of u), with zeros in the diagonals (since the distance between every
state and itself is zero) and the whole matrix bordered by 1’s. The
matrix is here a square table with rows and columns for each of the
states, so that each cell in the matrix contains the square of the
distance between two states. Setting the determinant of such a matrix
equal to zero and solving for u, we would obtain the value for u which
would permit us to construct the points in a space of five dimensions
or less.†
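
The rank criterion quoted above can be checked numerically. The following is a minimal sketch, not the procedure used in this study: it assumes hypothetical integer coordinates for a few points, builds the matrix of squared distances with zeros in the diagonal and a border of 1's, and recovers the dimensionality as the rank minus two. The function names (`bordered_matrix`, `matrix_rank`, `dimensionality`) are illustrative only.

```python
from fractions import Fraction


def bordered_matrix(points):
    """Squared mutual distances, zero diagonal, bordered by a row and column of 1's."""
    n = len(points)
    size = n + 1
    m = [[Fraction(0)] * size for _ in range(size)]
    for i in range(n):
        m[i][n] = Fraction(1)  # bordering column
        m[n][i] = Fraction(1)  # bordering row
        for j in range(n):
            m[i][j] = Fraction(
                sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            )
    return m


def matrix_rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next(
            (r for r in range(rank, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def dimensionality(points):
    """Number of dimensions needed: two less than the rank of the bordered matrix."""
    return matrix_rank(bordered_matrix(points)) - 2


# Hypothetical configurations: five points lying in a plane, five spanning space.
planar = [(0, 0, 0), (4, 0, 0), (0, 3, 0), (2, 5, 0), (1, 1, 0)]
spatial = [(0, 0, 0), (4, 0, 0), (0, 3, 0), (0, 0, 2), (1, 1, 1)]
print(dimensionality(planar), dimensionality(spatial))
```

With measured rather than exact distances the determinant will rarely vanish exactly, so in practice a tolerance (or the trial of several values of u, as described in the text) would take the place of the exact rank test used here.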

Since the solution of the determinant using all seven states would 
have been extremely laborious, it was decided to solve for u by using 
only five of the states at a time, thereby finding a value for u in three
dimensions. Using the data from the method of rank order as applied 
in March, 1939, the logical value for u was 3.18 for France, Germany, 
Japan, U.S.A., and U.S.S.R. For France, Germany, Great Britain, 
Italy, and U.S.S.R., u was 3.33. The average of these two figures 
(which included all seven states), or 3.25, was adopted as the final 
value for u. The twenty-one inter-state distances were then expressed 
in terms of this unit. 

In constructing a wax model of the Great Powers (see Fig. 2), 
all the interstate distances fitted into the system, with the exception 
of Japan-Russia. Leaving out this one distance, there was no dis- 
crepancy over 0.15, and the average discrepancy was 0.07 (2.6 per 
cent of the median scale distance). Japan’s distances were consistent 
with all the other states except Russia. It was decided to add a 
second “ball” on the model for Japan (labelled J2), placing it at the
correct distance from Russia and from its “friends,” Germany and 
Italy. A fourth dimension, impossible to represent geometrically, 
would probably be needed in order to remove the discrepancy between 
Japan and Russia. 

This model is a geometrical representation, in three dimensions,
of the opinions of 241 competent judges as to the relative friendliness
of the Great Powers in relation to one another in March, 1939. The
simplest dimensions of the model should represent the fundamental
or primary factors which led the experts to judge the relative
friendliness as they did, and they might represent the three primary
factors in the international politics of the Great Powers in March,
1939. The three dimensions were regarded as three axes perpendicular
to one another. Each state then has a certain projection on each axis
which guides us in attempting to interpret the meaning of the axis.
Following both the geometrical configuration of the model and certain
well-known relations among the Great Powers, the three axes were
tentatively drawn as follows: Axis 1, a line joining the mid-points of
the lines between Germany-Italy and France-Great Britain; Axis 2,
a line perpendicular to Axis 1 at its midpoint, going in the direction
of Russia and Japan (J1); Axis 3, a line at right angles to both
Axes 1 and 2. Axes 1 and 2 can be imagined as horizontal and vertical
lines on the photograph of the model; Axis 3 is at right angles to the
plane of the paper. The projections of the states on the three axes
are given in Table 1.

* The word “two” was inadvertently omitted from the article in
Psychometrika. The symbols i and j represent any two points.

† The rank of a matrix is equal to the highest order of the non-vanishing
minors (see L. L. Thurstone, The Vectors of Mind, Chicago, 1935, pp. 10, 81-85).
The order of a square matrix equals the number of its rows or columns, eight in
this case. If this eighth-order determinant vanishes (i.e., if its value becomes
zero when the proper value is substituted for u), then there remain several
seventh-order “minors” which may not vanish and thus give the matrix a rank of
seven. The dimensionality is then not more than five, according to the theorem of
Young and Householder, in which the number of dimensions is two less than the
rank.


CHART B 





FIGURE 2 * 
Model of Great Powers, March, 1939 


* The letters on the photograph of the wax model in Fig. 2 represent the
Great Powers, as follows: F, France; G, Germany; B, (Great) Britain; I,
Italy; J1, Japan in relation to the democratic states, Germany, and Italy;
J2, Japan in relation to Russia, Germany, and Italy; R, Russia; U, U.S.A.


TABLE 1
Projections of Great Powers on Three Axes of Model

        Axis 1                     Axis 2                     Axis 3
State          Projection  State          Projection  State          Projection
France           -1.80     Japan (2)        -1.45     U.S.A.           -1.45
U.S.A.           -1.80     Great Britain    -0.60     Italy            -0.70
Great Britain    -1.15     Germany          -0.20     Great Britain    -0.45
U.S.S.R.         -0.95     Italy             0.20     Japan (2)        -0.05
Japan (2)         0.40     U.S.A.            0.35     U.S.S.R.          0.00
Japan (1)         0.80     France            0.60     France            0.40
Germany           1.45     Japan (1)         1.45     Japan (1)         0.60
Italy             1.55     U.S.S.R.          2.25     Germany           0.70


If the axes were correctly interpreted, we can say that the three
primary factors involved in March, 1939 were: Axis 1, “dynamism”
(national attitudes insistent upon change); Axis 2, “communism”
(opposition to or fear of it); Axis 3, “belligerency” (in the
psychological sense of willingness and readiness to fight). The model
shows only the relative position of the states in regard to these
factors or dimensions; it does not, of course, explain why the states
are so arranged, since such an explanation must be preceded by detailed
analysis. It should be possible to describe any Great Power in terms of
its location on these “primary factor” scales. We could say, according
to the interpretation given, that Italy, for example, wanted very much
to change the “status quo” in March, 1939, but was quite unwilling
to risk war with the Great Powers to achieve her aims, while her
opposition to or fear of communism was quite moderate. At any rate,
this “multidimensional” experiment suggests a line of approach, by
some kind of “factor analysis,” which should be fruitful in the study
of international relations.


CONCLUSIONS 


This study has been concerned with one method of determining 
inter-state relations and attitudes — through the measurement of the 
opinion of students of international affairs. It is believed that the 
results have proved significant enough that the method should be 
further developed and applied systematically to the measurement of 
national attitudes. Other methods should be employed at the same 
time, as the results would check and supplement one another. 

A program for regular measurement could be worked out by 
having, say, a hundred well-versed and experienced students of inter- 
national affairs (including professors, authors, newspaper correspon- 
dents, diplomatic and consular officials, and the like, from several im- 
portant states if possible) give ratings on the same dates as to the 
relative friendliness of the Great Powers with one another; while
the attitudes of the smaller states could best be rated by specialized
students and trained residents of the states in question. 

An outstanding feature of the chart (Fig. 1) is the sharpness of 
the shifts for some of the trends. Attitudes of states or governments 
can apparently change very quickly. If we had a graph of these 
trends measured at regular intervals (as monthly or bi-monthly), over 
a period of some years, we would be on safer ground in attempting to 
analyze the significant factors in international relationships of the 
present. The prominence of the shifts in attitudes would cause one to 
be wary of using the chart for predictive purposes, although long- 
continued trends (such as the growing hostility of Germany-U.S.A.) 
may be suggestive. The rapidity of these shifts in attitudes also sug- 
gests that it may be discovered that less importance should be attached 
to so-called basic factors — as geographical and economic factors, 
which usually change slowly if at all — than is sometimes supposed. 

Of the various indices for determining inter-state trends and 
positions, the measurement of the composite judgment of students of 
international affairs appears to be one of the best and most efficient. 
The methods can be applied very quickly so as to give up to date 
results, a great number of factors receive weight in the judgments, 
and the attitudes of a large number of states can be measured at the 
same time.


HOW “G” CAN DISAPPEAR
C. SPEARMAN 


In a recent number of this Journal (June, 1941), Professor 
Guilford has interestingly called into question the assertion of several 
authors that Thurstone’s method of analysis (by maximal frequency 
of zero loadings)


“cannot discover a g factor even when such a factor 
exists.” 


To settle the point, Guilford has submitted to the said analysis a 
fictitious factor matrix which was expressly devised to contain g. The 
result was that this factor — in opposition to the assertion mentioned 
above — did duly make its appearance. Thereupon he fairly enough 
proceeds to conclude that:


“a g factor will not necessarily escape the analysis” *

Not so acceptable, however, is his wide generalization that:


“the problem is typical enough to enable us to predict
that a g factor if present will usually be discovered.”


For I venture to call his attention to my proof that really any g
introduced (when employing the principle of frequent zeroes) either may
or may not come to expression in the analysis. I showed that the latter
or negative issue depends on certain influences that, though improper,
are only too frequent. Notable are small correlations, large sampling
errors, and multitudinous group factors. †

In empirical support of this theoretical contention, I quoted
Thurstone’s own research, in which all these three unfavorable
influences had been exceptionally dominant. Accordingly, all traces of
the ordinary g had here, apparently for the first time, vanished,
whereas in the subsequent work of Wright, all three influences were
greatly reduced.‡ And here, concordantly once more, g was forthwith
restored.

Now, the present case of Guilford appears to resemble that of
Wright, in that with both investigators alike the three influences were
far smaller than with Thurstone. For instance, Guilford introduces
only two group factors, whereas Thurstone had posited no less than
twelve.


* The italics in the present article are the writer's.
† Spearman, C. Thurstone’s Work “Reworked.” J. educ. Psychol., 1939, 30,
‡ Wright, Ruth E. A factor analysis of the original Stanford-Binet scale.
Psychometrika, 1939, 4, 209-220.


THE EVALUATION OF LINEAR FORMS


P. S. DWYER 
UNIVERSITY OF MICHIGAN 


This presentation deals with the evaluation and transformation
of linear forms. Especial emphasis is given to implicit methods in
which it is not necessary to find the explicit values, $x_i$. The relation
of the Aitken triple product matrix $CA^{-1}B$ to the result of a linear
transformation of linear forms is noted, and the numerical computation
of this triple product matrix is indicated with the use of the
simple Abbreviated Doolittle solution. Application is also made to
the evaluation of $A^{-1}$ and of $A^{-1}C$.


1. Introduction.

It is frequently desired to find the value of a linear form when
the values of the variables are given implicitly by a set of linear
equations. Thus we may wish to find the value of the linear form

$$a_{1i} x_1 + a_{2i} x_2 + a_{3i} x_3, \tag{1}$$

$i > 3$, when the values of $x_1$, $x_2$, $x_3$ are implicitly defined by the
equations, $\Delta \neq 0$,

$$\begin{aligned}
a_{11} x_1 + a_{21} x_2 + a_{31} x_3 &= a_{41} \\
a_{12} x_1 + a_{22} x_2 + a_{32} x_3 &= a_{42} \\
a_{13} x_1 + a_{23} x_2 + a_{33} x_3 &= a_{43}.
\end{aligned} \tag{2}$$

It is the purpose of this paper to show how the methods and notation
of earlier papers (1) (2) can be used in evaluating the general linear
form. Special emphasis is given to implicit solutions which give the
value of the form without finding the explicit values of the variables
in the form.

In many problems it is desired to evaluate simultaneous linear
forms. For example, we may wish to evaluate (1) for a series of
values, $i$, when in each case the values of $x_1$, $x_2$, and $x_3$ are
implicitly defined by the equations (2). Application is made in this
paper to a number of such situations and illustrations are given. In
many of these cases it is desirable to translate the particular problem
and the indicated solution to matrix notation.


2. The Basic Theory.

The basic problem is indicated by the three-variable illustration
above. We can solve for $x_1$, $x_2$, and $x_3$ in equation (2) by improved
methods (1) and then substitute in equation (1) but it is usually
preferable to find the value

$$a_{1i} x_1 + a_{2i} x_2 + a_{3i} x_3 = a_{4i} \tag{3}$$

without finding the values of $x_1$, $x_2$, and $x_3$. This means that the
three equations of (2) with $\Delta \neq 0$ and the additional equation (3)
must be satisfied simultaneously. The condition for this is that

$$\begin{vmatrix}
a_{11} & a_{21} & a_{31} & a_{41} \\
a_{12} & a_{22} & a_{32} & a_{42} \\
a_{13} & a_{23} & a_{33} & a_{43} \\
a_{1i} & a_{2i} & a_{3i} & a_{4i}
\end{vmatrix} = 0. \tag{4}$$

It follows that $A = |a_{ij}| = 0$ with the minor of $a_{4i} \neq 0$. If we use the
method of single division for evaluating determinants (or any of its
variations) (2), we get

$$A = a_{11}\, a_{22.1}\, a_{33.12}\, a_{4i.123} \quad \text{with} \quad a_{11}\, a_{22.1}\, a_{33.12} \neq 0.$$

It follows that

$$a_{4i.123} = 0 = a_{4i} - \frac{a_{41}\, a_{1i}}{a_{11}} - \frac{a_{42.1}\, a_{2i.1}}{a_{22.1}} - \frac{a_{43.12}\, a_{3i.12}}{a_{33.12}}$$

and hence the value of the linear form is

$$a_{4i} = a_{1i}\, b_{41} + a_{2i.1}\, b_{42.1} + a_{3i.12}\, b_{43.12}. \tag{5}$$

An alternative form is

$$a_{4i} = a_{41}\, b_{1i} + a_{42.1}\, b_{2i.1} + a_{43.12}\, b_{3i.12}. \tag{6}$$

In many practical problems the matrix of the coefficients of (2)
is symmetric. In such a case $A$ is “almost symmetric” (2, p. 200) and
abbreviated methods are applicable. In particular the Abbreviated
Doolittle method is applicable. We add an additional column, $a_{i1} = a_{1i}$,
$a_{i2} = a_{2i}$, $a_{i3} = a_{3i}$, and the value of $a_{4i}$ can be written

$$a_{4i} = a_{i1}\, b_{41} + a_{i2.1}\, b_{42.1} + a_{i3.12}\, b_{43.12}. \tag{7}$$

The method is illustrated in Table 1 where the theoretical presentation
is given on the left and the actual three-place solution of

$$\begin{aligned}
.400 x_1 + .600 x_2 + .800 x_3 &= \; ? \quad \text{with} \\
.800 x_1 + .480 x_2 + .360 x_3 &= 1.000 \\
.480 x_1 + .800 x_2 + .360 x_3 &= .000 \\
.360 x_1 + .360 x_2 + .860 x_3 &= .000
\end{aligned} \tag{8}$$

is given on the right.


TABLE 1
Implicit Evaluation of Linear Form: Abbreviated Doolittle Method

Theoretical                                  Numerical

a_11  a_21    a_31    | a_41    | a_i1        .800   .480   .360 |  1.000 |  .400
      a_22    a_32    | a_42    | a_i2               .800   .360 |   .000 |  .600
              a_33    | a_43    | a_i3                      .860 |   .000 |  .800

a_11  a_21    a_31    | a_41    | a_i1        .800   .480   .360 |  1.000 |  .400
1     b_21    b_31    | b_41    | b_i1       1.000   .600   .450 |  1.250 |  .500
      a_22.1  a_32.1  | a_42.1  | a_i2.1             .512   .144 |  -.600 |  .360
      1       b_32.1  | b_42.1  | b_i2.1            1.000   .281 | -1.172 |  .703
              a_33.12 | a_43.12 | a_i3.12                   .658 |  -.281 |  .519
              1       | b_43.12 | b_i3.12                  1.000 |  -.427 |  .789

                                  a_4i                                       -.144
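
The forward solution laid out in Table 1 can also be sketched in code. The following is a hedged illustration rather than the author's own worksheet: the function name `implicit_linear_form` and its arguments are hypothetical, the coefficient matrix is assumed symmetric with non-zero pivots, and full floating-point precision is carried instead of three places. It accumulates the products of formula (7) during the forward elimination, so no back solution is performed.

```python
def implicit_linear_form(a, rhs, coef):
    """Value of coef . x where a x = rhs, with a symmetric, using the
    forward solution only (no back solution for x)."""
    n = len(a)
    # Each working row holds the coefficients, the right-hand side, and the
    # coefficient of the linear form (the added column of the symmetric case).
    rows = [list(a[k]) + [rhs[k], coef[k]] for k in range(n)]
    value = 0.0
    for k in range(n):
        pivot = rows[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot in forward solution")
        # Reduced form coefficient times the divided right-hand side entry:
        # one term of formula (7).
        value += rows[k][n + 1] * (rows[k][n] / pivot)
        # Eliminate variable k from the remaining rows.
        for r in range(k + 1, n):
            factor = rows[r][k] / pivot
            for c in range(k, n + 2):
                rows[r][c] -= factor * rows[k][c]
    return value


# The three-variable illustration (8).
A = [[0.800, 0.480, 0.360],
     [0.480, 0.800, 0.360],
     [0.360, 0.360, 0.860]]
value = implicit_linear_form(A, [1.000, 0.000, 0.000], [0.400, 0.600, 0.800])
print(round(value, 3))
```

The sketch relies on symmetry: for a non-symmetric matrix the added column would have to be reduced with the transposed multipliers, which this version does not attempt.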





Application of (7) gives (see the underscored terms of Table 1)

$$a_{4i} = (.400)(1.250) + (.360)(-1.172) + (.519)(-.427) = -.144.$$

Application of the alternative formula (6) yields

$$a_{4i} = (1.000)(.500) + (-.600)(.703) + (-.281)(.789) = -.144.$$

It is possible to use the methods of multiplication and subtraction
similarly. Thus

$$A = \frac{A_{4i.123}}{a_{11}^{2}\, A_{22.1}} = 0 \quad \text{so that} \quad A_{4i.123} = 0,$$

$$0 = \{ (a_{4i}\, a_{11} - a_{41}\, a_{1i})\, A_{22.1} - A_{42.1}\, A_{2i.1} \}\, A_{33.12} - A_{43.12}\, A_{3i.12},$$

and

$$a_{4i} = \frac{a_{41}\, a_{1i}}{a_{11}} + \frac{A_{42.1}\, A_{2i.1}}{a_{11}\, A_{22.1}} + \frac{A_{43.12}\, A_{3i.12}}{a_{11}\, A_{22.1}\, A_{33.12}}.$$

The Compact solution of the problem above is presented in Table 2.
The basic theory has been worked out and illustrated in some detail
for the case of 3 variables. The general development for the case of
a larger number of variables is very similar to that outlined above.

3. The Square of the Multiple Correlation Coefficient.

One of the most important applications of the above theory is to
the problem of determining the multiple correlation coefficient
without finding the $\beta$’s. The square of the multiple correlation
coefficient is a linear form in the $\beta$’s,

$$r^{2}_{0.12 \ldots n} = r_{01}\, \beta_{01.2 \ldots n} + r_{02}\, \beta_{02.13 \ldots n} + \cdots + r_{0n}\, \beta_{0n.12 \ldots (n-1)},$$

where the $\beta$’s are implicitly given by the normal equations which have
the correlations as coefficients. The square of the multiple correlation
is readily obtained from the forward solution of the normal equations
without going through the back solution. In fact, it is possible to
obtain a large number of related multiple correlation coefficients
$r_{5.12}$, $r_{5.123}$, $r_{5.1234}$, $r_{4.12}$, $r_{4.123}$, etc. from the same forward
solution.

4. The Alternative to the Back Solution.

It is customary to find the values of $x_1$, $x_2$, and $x_3$ in such
equations as (2) by use of the “back solution.” However $x_1$ is a special
case of the linear form with $a_{1i} = 1$, $a_{2i} = 0$, $a_{3i} = 0$. Similarly $x_2$
is a special case with $a_{1i} = 0$, $a_{2i} = 1$, $a_{3i} = 0$. It is at once possible
to find $x_1$, $x_2$, or $x_3$ by the methods outlined above. In the first 6
columns of Table 3 there is presented the Abbreviated Doolittle Solution
for $x_1$ in the case of the four-variable problem

$$\begin{aligned}
x_1 + .4 x_2 + .5 x_3 + .6 x_4 &= .2 \\
.4 x_1 + x_2 + .3 x_3 + .4 x_4 &= .4 \\
.5 x_1 + .3 x_2 + x_3 + .2 x_4 &= .6 \\
.6 x_1 + .4 x_2 + .2 x_3 + x_4 &= .8
\end{aligned} \tag{9}$$

with

$$x_1 = (1.0000)(.2000) + (-.4000)(.3810) + (-.4524)(.6258) + (-.5966)(1.1748) = -.9364.$$


TABLE 3
Value of x_1: Abbreviated Doolittle Method

 1.0      .4      .5      .6    |    .2    |   1.0       0       0       0
         1.0      .3      .4    |    .4    |    0       1.0      0       0
                 1.0      .2    |    .6    |    0        0      1.0      0
                         1.0    |    .8    |    0        0       0      1.0

1.0000   .4000   .5000   .6000  |   .2000  |  1.0000   .0000   .0000   .0000
1.0000   .4000   .5000   .6000  |   .2000  |  1.0000   .0000   .0000   .0000
         .8400   .1000   .1600  |   .3200  |  -.4000  1.0000   .0000   .0000
        1.0000   .1190   .1905  |   .3810  |  -.4762  1.1905   .0000   .0000
                 .7381  -.1190  |   .4619  |  -.4524  -.1190  1.0000   .0000
                1.0000  -.1612  |   .6258  |  -.6129  -.1612  1.3548   .0000
                         .5903  |   .6935  |  -.5966  -.2097   .1612  1.0000
                        1.0000  |  1.1748  | -1.0107  -.3552   .2731  1.6941

                                              -.9364   .0602   .8152  1.1748
                                                x_1     x_2     x_3     x_4


The last three columns of Table 3 give the values of $x_2$, $x_3$, and
$x_4$ similarly. The simultaneous calculation of these four forms
provides information equivalent to that of the conventional back
solution.

The Compact method can be used also.
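
The alternative to the back solution can be sketched as follows. This is an illustrative reading of the scheme, not the author's worksheet: the function name `solve_by_unit_forms` is hypothetical, the coefficient matrix is assumed symmetric with non-zero pivots, and each unknown is treated as a linear form whose coefficients are a column of the identity matrix carried through the forward solution.

```python
def solve_by_unit_forms(a, rhs):
    """Solve a x = rhs (a symmetric) from the forward solution alone by
    treating each unknown as a linear form with unit coefficients."""
    n = len(a)
    # Working rows: coefficients | right-hand side | identity columns.
    rows = [
        list(a[k]) + [rhs[k]] + [1.0 if j == k else 0.0 for j in range(n)]
        for k in range(n)
    ]
    x = [0.0] * n
    for k in range(n):
        pivot = rows[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot in forward solution")
        divided_rhs = rows[k][n] / pivot
        # Add this row's contribution to every unknown at once.
        for j in range(n):
            x[j] += rows[k][n + 1 + j] * divided_rhs
        # Eliminate variable k from the remaining rows.
        for r in range(k + 1, n):
            factor = rows[r][k] / pivot
            for c in range(k, 2 * n + 1):
                rows[r][c] -= factor * rows[k][c]
    return x


# The four-variable problem (9).
A = [[1.0, 0.4, 0.5, 0.6],
     [0.4, 1.0, 0.3, 0.4],
     [0.5, 0.3, 1.0, 0.2],
     [0.6, 0.4, 0.2, 1.0]]
solution = solve_by_unit_forms(A, [0.2, 0.4, 0.6, 0.8])
print([round(v, 4) for v in solution])
```

Because full precision is carried here, the last digit of a result may differ slightly from the four-place worksheet values.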
5. The Implicit Transformation of Linear Forms.

If $a_{ij} = a_{ji}$ and $|a_{ij}| \neq 0$ for $i, j \leq 3$ and if a transformation is
expressed implicitly by the equations,

$$\begin{aligned}
a_{11} x_1 + a_{21} x_2 + a_{31} x_3 &= a_{41} x_4 + a_{51} x_5 + a_{61} x_6 \\
a_{12} x_1 + a_{22} x_2 + a_{32} x_3 &= a_{42} x_4 + a_{52} x_5 + a_{62} x_6 \\
a_{13} x_1 + a_{23} x_2 + a_{33} x_3 &= a_{43} x_4 + a_{53} x_5 + a_{63} x_6,
\end{aligned} \tag{10}$$

then it is possible to evaluate $a_{1i} x_1 + a_{2i} x_2 + a_{3i} x_3$ in terms of $x_4$, $x_5$,
and $x_6$ without finding the explicit values for $x_1$, $x_2$ and $x_3$ in terms
of $x_4$, $x_5$ and $x_6$ by the methods of the last section. The details are
shown in the first seven columns of Table 4. The six columns of (10)
above are followed by the column $a_{ij}$, where $a_{ij} = a_{ji}$. The Abbreviated
Doolittle method is applied and the values found from the formulas

$$\begin{aligned}
a_{4i} &= a_{i1}\, b_{41} + a_{i2.1}\, b_{42.1} + a_{i3.12}\, b_{43.12} \\
a_{5i} &= a_{i1}\, b_{51} + a_{i2.1}\, b_{52.1} + a_{i3.12}\, b_{53.12} \\
a_{6i} &= a_{i1}\, b_{61} + a_{i2.1}\, b_{62.1} + a_{i3.12}\, b_{63.12}.
\end{aligned} \tag{11}$$

The values of other forms such as $a_{1j} x_1 + a_{2j} x_2 + a_{3j} x_3$,
$a_{1k} x_1 + a_{2k} x_2 + a_{3k} x_3$, etc., can be transformed at the same time.
The work is shown in Table 4.


TABLE 4
Transformation: Abbreviated Doolittle Method

a_11  a_21    a_31    | a_41    a_51    a_61    | a_i1    a_j1    a_k1
a_12  a_22    a_32    | a_42    a_52    a_62    | a_i2    a_j2    a_k2
a_13  a_23    a_33    | a_43    a_53    a_63    | a_i3    a_j3    a_k3

a_11  a_21    a_31    | a_41    a_51    a_61    | a_i1    a_j1    a_k1
1     b_21    b_31    | b_41    b_51    b_61    | b_i1    b_j1    b_k1
      a_22.1  a_32.1  | a_42.1  a_52.1  a_62.1  | a_i2.1  a_j2.1  a_k2.1
      1       b_32.1  | b_42.1  b_52.1  b_62.1  | b_i2.1  b_j2.1  b_k2.1
              a_33.12 | a_43.12 a_53.12 a_63.12 | a_i3.12 a_j3.12 a_k3.12
              1       | b_43.12 b_53.12 b_63.12 | b_i3.12 b_j3.12 b_k3.12

                        a_4i    a_5i    a_6i
                        a_4j    a_5j    a_6j
                        a_4k    a_5k    a_6k























It seems to be wise to state the problem of the transformation of
linear forms in matrix notation. The implicit transformation (10)
can be written $AX = BY$ where $A$ and $B$ are known and $A$ is a
non-singular symmetric matrix. We wish to find $D$ so that the matrix
equation $CX = DY$, with $C$ known, is also satisfied. Then $X = A^{-1}BY$ and
$DY = CX = CA^{-1}BY$, so $D = CA^{-1}B$. In Table 4 the solution $D$ is
given in the last three rows while the values $A$, $B$, $C$ are the
successive matrices of the first three rows. In order to evaluate a triple
product matrix of the form $CA^{-1}B$, with $A$ non-singular and symmetric,
it is necessary only to carry the Abbreviated Doolittle method
through $A$ and make repeated use of formula (11).
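
The matrix statement above can be sketched in code. This is a minimal illustration under stated assumptions, not the author's tabular routine: the function name `triple_product` is hypothetical, `a` is assumed symmetric and non-singular with non-zero pivots, and the matrices are plain lists of rows. The forward elimination is carried through `a` once, and every element of the product is accumulated from products of reduced entries in the manner of formula (11).

```python
def triple_product(c, a, b):
    """Return C A^{-1} B for symmetric, non-singular A using only the
    forward elimination on A (no explicit inverse, no back solution)."""
    n = len(a)      # order of A
    p = len(b[0])   # number of columns of B
    m = len(c)      # number of rows of C
    # Working rows: A | B | transpose of C (each form is an added column).
    rows = [
        list(a[k]) + list(b[k]) + [c[i][k] for i in range(m)]
        for k in range(n)
    ]
    d = [[0.0] * p for _ in range(m)]
    for k in range(n):
        pivot = rows[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot in forward solution")
        # Reduced C entry times divided B entry, for every pair (i, j).
        for i in range(m):
            for j in range(p):
                d[i][j] += rows[k][n + p + i] * rows[k][n + j] / pivot
        # Eliminate variable k from the remaining rows.
        for r in range(k + 1, n):
            factor = rows[r][k] / pivot
            for col in range(k, n + p + m):
                rows[r][col] -= factor * rows[k][col]
    return d


# The symmetric matrix of illustration (8); with B = C = I the product is the inverse.
A = [[0.800, 0.480, 0.360],
     [0.480, 0.800, 0.360],
     [0.360, 0.360, 0.860]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
inverse = triple_product(I3, A, I3)
for row in inverse:
    print([round(v, 3) for v in row])
```

Taking `c` as a single row and `b` as a single column reduces this to the evaluation of one linear form, and taking `c` as the identity gives the solution of `A X = B`.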

The evaluation of a triple product matrix of this type has been 
shown previously by Aitken who used the “method of pivotal con- 
densation” (3) (4). Aitken used a bordering technique which neces- 
sitates considerable space. He also worked out his technique without 
assuming a symmetric A. Now the abbreviation of the method of 
pivotal condensation is essentially the same as the Abbreviated Doo- 
little method (1) so, in case symmetry is present, it abbreviates to the 
method outlined above. Similarly the essentials of the bordered
results of the Aitken method are given in the method above. It follows
that the methods of this paper may be viewed as abbreviation of the
Aitken method of evaluating the triple product matrix when A is sym- 
metric. These abbreviations tend to simplify the presentation by 
eliminating unnecessary recording, by making the solution more com- 
pact, and by making the “ritual” easier and simpler if, as is usual in 
statistical work, the matrix A is symmetric. 

Important special cases are discussed and illustrated in the sec- 
tions following. 

6. The Inverse of a Matrix.

The inverse of a matrix can be obtained by solving the matrix
equation $AX = I$. This can be accomplished by the usual forward and
back solutions. Another method of finding the inverse of a matrix,
whose origins go back to work by Horst and Waugh (5) (6) and Cureton
(7), has been presented in a previous paper (1). A third method
is obtained as a special case of the technique of the last section. If
$B = I$ and $C = I$, then $D = A^{-1}$. It follows that it is necessary only to
place the matrices in the order $A$, $I$, $I$, and to complete the
Abbreviated Doolittle solution. However, it is not necessary to record the
columns under the second matrix $I$, since they are exact duplicates of
the columns under the first matrix $I$ and these first columns can be
used in making the calculations. This elimination necessitates a slight
adjustment in technique in calculating the final values, but this
technique is easily learned. The scheme is perhaps most quickly
understood from an illustration. The problem used is that given by Tucker
(8) to illustrate the use of Aitken’s method in getting the inverse of
a matrix. Tucker used a variation of the method of single division
and the bordered matrix of Aitken while the presentation here utilizes
the more satisfactory Abbreviated Doolittle method and the equations
(11).


TABLE 5
The Inverse of a Matrix
(Tucker Illustration with Abbreviated Doolittle Method)

 .800    .480    .360  |  1.000     .000     .000
         .800    .360  |   .000    1.000     .000
                 .860  |   .000     .000    1.000

 .800    .480    .360  |  1.000     .000     .000
1.000    .600    .450  |  1.250     .000     .000
         .512    .144  |  -.600    1.000     .000
        1.000    .281  | -1.172    1.953     .000
                 .658  |  -.281    -.281    1.000
                1.000  |  -.427    -.427    1.520

                          2.073   -1.052    -.427
                                   2.073    -.427
                                            1.520


The matrix $A$ is followed by $I$. The Abbreviated Doolittle method
is carried out. The values of the elements of $A^{-1}$ are obtained from
the entries of the three double rows above. Thus,

$$\begin{aligned}
2.073 &= (1.000)(1.250) + (-.600)(-1.172) + (-.281)(-.427) \\
-1.052 &= (1.000)(.000) + (-.600)(1.953) + (-.281)(-.427)
\end{aligned}$$

etc.

Since $A$ is symmetric, $A^{-1}$ is symmetric and the symmetric values
need not be duplicated.

The reader may wish to examine a recent article by Hoel in which
certain inverse matrix methods are compared. (9)

7. The Solution of the Matrix Equation AX = B.

The method can also be used in obtaining the values of $X = A^{-1}B$
since $A^{-1}B$ is a special case of $CA^{-1}B$ with $C = I$.

As an illustration we use a problem of Ledermann (10, page 115)
who used Aitken’s method. The Abbreviated Doolittle solution is
shown in Table 6. In this case we present the matrix $I$ before the
matrix $B$ (though it can be inserted after $B$). The Abbreviated
Doolittle solution is carried out and the value of $A^{-1}B$ computed. In the
illustration the value of $A^{-1}$ is also indicated, though it is not
necessary to obtain $A^{-1}$ in order to secure $A^{-1}B$.


TABLE 6
X = A^{-1}B
(Ledermann Illustration with Abbreviated Doolittle Method)


The compact method can be used also.

8. Conclusion. 

It has been shown how to evaluate and to transform linear forms 
implicitly with the use of the Abbreviated Doolittle and Compact 
methods. This technique leads to an abbreviated solution of Aitken’s 
triple product matrix. The chief purpose of the paper is to demon- 
strate the relative ease and practicability of the Abbreviated Doolittle 
method in solving the problems discussed. The reader will understand 
this point better if he actually compares the solutions given here with 
the solutions indicated previously by such authors as Thompson (3), 
Aitken (4), Tucker (8), and Ledermann (10).


A SIMPLE SCORING WEIGHT FOR TEST ITEMS
AND ITS RELIABILITY


J. P. GUILFORD 
UNIVERSITY OF SOUTHERN CALIFORNIA 


It is pointed out that the scoring weights for test items should 
be approximations to regression-equation weights. For this reason 
any estimate of reliability of the weight should not be permitted to 
influence the size of the weight but should be used in determining 
the limit of acceptability of an item. A simple approximation weight 
is recommended for general use, and an abac is provided for the es- 
timation of it when the correlation between item and criterion is 
the phi coefficient. A formula for the standard error of this weight 
is derived and tables of significant and very significant weights are 
presented in terms of deviations from the median weight. 


In the scoring of reactions to items in personality questionnaires 
and interest inventories something may be gained by assigning dif- 
ferential weights to the items. Perhaps the most customary weight 
employed for this purpose has been that computed by means of the 
Cowdery-Kelley formula, 


$$W = \frac{\phi\, \sigma_1}{\sigma_2\, (1 - \phi^2)}, \tag{1}$$

though more recently Kelley has repudiated this formula, proposing 
another to be used in its stead. It is my purpose here to suggest that 
both these formulas can be questioned in principle and to propose a 
simpler one. An abac is provided for graphic computation of weights, 
and also some tables for determining whether weights so derived are 
statistically significant. 

The accepted rationale underlying the combining of tests into a 
battery and the prediction of some criterion score from a single bat- 
tery score leads to a multiple regression equation in which each test 
is weighted according to its regression coefficient. Thus the maximum 
accuracy of prediction is to be attained. The regression weight for 
score $X_2$, when $X_1$ is the criterion score, would read in equation form,


$$b_{12.34 \ldots n} = \left( \frac{\sigma_1}{\sigma_2} \right) \beta_{12.34 \ldots n}, \tag{2}$$


in which the symbols are very familiar. The same reasoning should 
apply to the combination of items in a test. Unfortunately, the
number of items is usually so great that the labor of computing the beta
coefficients is considered prohibitive. The beta coefficient in each case
depends upon the correlation of the item with the criterion as well
as upon the intercorrelations of the items with one another. In
practice, we assume that each item’s beta coefficient is approximately
proportional to its raw coefficient of correlation with the criterion, and
this assumption carries the implication that the effects of
intercorrelations among the items are uniform as they concern the various
betas. This practical step reduces the regression weight to the form


$$b_{12} = \left( \frac{\sigma_1}{\sigma_2} \right) r_{12}, \tag{3}$$


if we may contract the subscript of b for convenience. 

Kelley assumes that when dealing with interest-inventory items 
and correlating each one with the criterion variable of presence or ab- 
sence in a given vocation we should assume a point distribution both 
ways (2, p. 506). Under these circumstances the phi coefficient is in- 
dicated and its value is equivalent to the Pearson r (3, p. 259). The 
same reasoning may also apply to other test situations, as for example 
masculinity-femininity inventories. At any rate, what one uses in the 
numerator of equation (3) will be determined by the assumptions he 
makes about his two correlated distributions. If the coefficient phi is 
used, the equation becomes 


$$b_{12} = \frac{\phi\,\sigma_1}{\sigma_2}. \qquad (4)$$


The chief difference between this and Kelley’s original formula (my
equation (1)) is his addition of the element $(1 - \phi^2)$ in the
denominator. The $\sigma_1$ in the numerator is constant throughout a set of
computations with the same criterion distribution. The new element
$(1 - \phi^2)$ is an approximation to the variance of the phi coefficient, in
other words, $\sigma_\phi^2$.

By bringing this element into the situation Kelley introduced a 
new principle into the customary regression equation. This is the prin- 
ciple that when observations are summated or averaged they should 
be weighted according to their reliabilities, or inversely as the square 
of their variabilities. Kelley’s 1934 formula for the weight includes 
instead of the index of reliability of phi, the index of reliability of 
the regression weight itself. But reliability of the regression weight 
has to do with the accuracy of predictions; it should have nothing to 
do with the size of the contribution of an item to the total score. No- 
where else in test procedures, to the knowledge of the writer, does a 
regression weight depend upon its reliability for its size, or upon the
reliability of any of its parameters. The index of reliability should 
be used to tell us within what limits we might expect the size of the 
regression weight to fluctuate with repeated samplings like the one 
we have. It is true that some attention should be paid to the relia- 
bility of a regression weight. The standard error of a regression 
weight tells us the likelihood that in repeated samplings the weight 
might become zero or even reversed in sign. But this information 
leads merely to acceptance or rejection of it as being significant or 
insignificant, respectively. In general practice, we do not augment 
the size of a regression weight when it is found more reliable and de- 
tract from it when it is found less reliable. Regression weights below 
a certain minimum standard, a standard established by reason of our 
knowing the standard errors of those weights, should be rejected en- 
tirely rather than being reduced in size and used. Once having re- 
tained an item or a weight by reason of significant weighting, we 
should not further tamper with the size of the weight by reason of 
the degree of its reliability. 

When our coefficient of correlation to be used in the regression 
weight is phi, the writer has previously shown (1) that it can be con- 
veniently computed by the formula 


$$\phi = \frac{p_u - p_l}{2\sqrt{pq}}, \qquad (5)$$





where

$p_u$ is the proportion of the upper (or positive) criterion sub-group
who respond in the specified manner,

$p_l$ is the proportion of the lower (or negative) criterion sub-group
who respond in the same manner,

$p$ is the proportion of the two sub-groups combined who respond in
this manner,

and

$q = 1 - p$.
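
The equivalence of this phi to the Pearson r for two point distributions can be checked numerically. The following is a minimal sketch in Python, assuming hypothetical data of 200 cases in each criterion sub-group; the function names and the counts are illustrative only and are not taken from the article.

```python
from math import sqrt

def phi_from_proportions(p_u, p_l):
    """Phi by formula (5), with the two criterion sub-groups weighted equally."""
    p = (p_u + p_l) / 2          # proportion for the two sub-groups combined
    q = 1 - p
    return (p_u - p_l) / (2 * sqrt(p * q))

def pearson_r(xs, ys):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: 142 of 200 upper cases and 61 of 200 lower cases
# give the specified response (coded 1); all others are coded 0.
criterion = [1] * 200 + [0] * 200
response = [1] * 142 + [0] * 58 + [1] * 61 + [0] * 139

phi = phi_from_proportions(142 / 200, 61 / 200)
r = pearson_r(criterion, response)
print(round(phi, 4), round(r, 4))
```

With sub-groups of equal size the two values agree to within rounding error, which is the sense in which phi "is equivalent to the Pearson r" here.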
Adapting this way of computing $\phi$ to equation (4), we have, using
the fact that $\sigma_2 = \sqrt{p_2 q_2}$,


$$b_{12} = \frac{(p_u - p_l)\,\sigma_1}{2\,p_2 q_2}, \qquad (6)$$
where 
$p_2$ is the proportion of the two sub-groups combined responding
in the specified manner,

and

$q_2 = 1 - p_2$.

Now $\sigma_1$ is equal to .5, for the reason that the criterion group is divided
into two equal sub-groups, or the two criterion sub-groups have been
equalized in importance so that $p_1 = q_1 = .5$. Then
$\sigma_1 = \sqrt{p_1 q_1} = .5$. Formula (6) then becomes

$$b_{12} = \frac{.5\,(p_u - p_l)}{2\,p_2 q_2} = \frac{p_u - p_l}{4\,p_2 q_2}. \qquad (7)$$





This situation gives us a maximum range of weights from —1.0 
to +1.0, which is too limited in scope. In practice we wish a larger 
range composed of small integral weights, for example a range from 
—4.0 to +4.0. Accordingly we may multiply our weight by the con- 
stant 4. This would still give us negative weights which are awkward 
to deal with in accumulating total scores. In order to eliminate nega- 
tive weights we may add a constant of 4.0, which makes the total 
range now from 0 to 8.0 points, with a “median” (when $\phi = 0$) of
4.0 points. The new formula then becomes 


$$W = \frac{p_u - p_l}{p_2 q_2} + 4, \qquad (8)$$

where $W = 4b_{12} + 4$.
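
Formula (8) lends itself to direct computation. The sketch below is in Python; the function names are illustrative, and the rounding rule for a weight falling exactly midway between two integers (take the integer nearer 4.0) follows the rule stated for reading the abac. The proportions used are those of the work sheet in Table 3.

```python
import math

def scoring_weight(p_u, p_l):
    """Weight W by formula (8), the two sub-groups being weighted equally."""
    p2 = (p_u + p_l) / 2
    q2 = 1 - p2
    return (p_u - p_l) / (p2 * q2) + 4

def integral_weight(w):
    """Nearest integer to w; an exact tie goes to the integer nearer 4."""
    lo, hi = math.floor(w), math.ceil(w)
    if w - lo == hi - w:                      # midway, or already integral
        return min((lo, hi), key=lambda k: abs(k - 4))
    return lo if w - lo < hi - w else hi

for label, p_u, p_l in [("Yes", .710, .305), ("?", .040, .030), ("No", .250, .665)]:
    w = scoring_weight(p_u, p_l)
    print(label, round(w, 2), integral_weight(w))
```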

The abac shown in Figure 1 was designed to facilitate the esti- 
mation of the weight W. Where small integral weights are to be de- 
termined, a graphic solution will suffice. The margins of the bands 
are drawn at W-values of 0.5, 1.5, 2.5, and so on to 7.5. A reaction to 
any item will be found to fall within one of the bands unless it comes 
midway between two integral values, in which case it is probably 
best to choose the smaller of the two band values, that is, the one 
deviating less from the midpoint of 4.0. Only the proportions $p_u$ and
$p_l$ need be known. The abac can be used with modifications. Some
investigators prefer to have negative as well as positive weights, as
may be the case when machine-scoring keys are prepared. In that 
case the constant 4 may be deducted from each weight derived from 
formula (8) or read from the abac. If a smaller range of weights is 
desired, the zones may be combined in any reasonable manner. It 
should be remembered that the abac rests upon the assumptions cited 
above plus the additional assumption that the weights so obtained 
have been found reliable. The use of $p_u$ and $p_l$ automatically equalizes
the criterion sub-groups even when one starts with more cases in the
one than in the other. If the phi-coefficient is regarded as inapplicable,
then some other estimate of the Pearson $r$ should be made and
equation (3) employed to determine the weight. Under the assumptions
adopted above, when the already familiar values for $\sigma_1$ and $\sigma_2$
are substituted in (3), the regression weight becomes


$$b_{12} = \frac{r_{12}}{2\sqrt{p_2 q_2}}. \qquad (9)$$

In a form analogous to formula (8), this becomes

$$W = \frac{2\,r_{12}}{\sqrt{p_2 q_2}} + 4. \qquad (10)$$


Finally, we need to determine the reliability of a weight by estimating
$\sigma_W$. But the weight adopted, $W$, equals $4b + 4$, so the standard
error of $W$ will be $4\sigma_b$. Let us first determine the standard error
of $b$. In his 1934 article (2) Kelley* does not give a formula for $\sigma_b$,
but from his derivation of his scoring weight, which is equal to $b/\sigma_b^2$,
we can deduce $\sigma_b$. From his equation (11), which applies to the
semi-equalized table,


$$W_k = \frac{b}{\sigma_b^2} = \frac{4N\Delta}{1 - \phi^2}, \qquad (11)$$
in which $W_k$ is Kelley’s scoring weight,
$\Delta$ is the cell divergence, and equals $(p_u - p_l)/4$,


and 
ng eee 
as 
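Taking reciprocals of both sides of (11),

$$\frac{\sigma_b^2}{b} = \frac{1 - \phi^2}{4N\Delta}.$$

Multiplying by $b$, which equals $\Delta/pq$ [Kelley’s equation (4)],

$$\sigma_b^2 = \frac{\Delta}{pq} \cdot \frac{1 - \phi^2}{4N\Delta} = \frac{1 - \phi^2}{4Npq}.$$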
When $\phi$ equals zero and so when $W = 4.0$, in other words to make the
null hypothesis as applied to the scoring weight,

$$\sigma_b^2 = \frac{1}{4Npq}, \qquad (12)$$

and taking square roots,

$$\sigma_b = \frac{1}{2\sqrt{Npq}}. \qquad (13)$$

The standard error of $W$ is four times as large, or

$$\sigma_W = \frac{2}{\sqrt{Npq}}. \qquad (14)$$


* I am grateful to Dr. Kelley for clarifying in correspondence a point in his
1934 article.


In order to examine the reliability of any particular scoring 
weight, its deviation from a weight of 4.0 (null hypothesis) is compared
with $\sigma_W$ for the combination of $N$ and $p$ from which the weight
was derived. The ratio of this deviation to its corresponding stand- 
ard error may be treated as a critical ratio or in terms of Fisher’s t 
values. Defining a significant deviation from a weight of 4.0 as one 
that could occur by sampling errors less than five times in a hundred 
and a very significant deviation as one that could similarly occur less 
than once in a hundred times, as Fisher does, one can determine, for 
various combinations of N and p, standard lists of significant and of 
very significant deviations. Tables 1 and 2 provide such lists, for N’s 


TABLE 1
Significant Deviations from a Weight of 4.0 for Varying
Proportions and Values of N

p or q | N   100   200   300   400   500   600   800
.98 2.82 1.99: 1.61 1.89 125 1.14 . .99 
97 2.384 1.68 1.88 115 1.02 .95 .81 
95 1B. 2227; 2304! 290% Bl 98 «64 
.90 132 -28 -t 66: 38. 68 47 
80 20 70 BT . A948. 2 _ 35 
-70 86. .61 ..60 48 ..87- +85 -.20 
50 | 10 26 45 29 35 S2 2 








varying from 100 to 800, inclusive. From these tables it will be seen 
that a criterion population of less than 400 may give weights deviat- 
ing from 4.0 as much as 1 whole unit, more than 5% of the time quite 
generally when p is .95 or above (also .05 or below), and more than 
1% of the time when $p$ is .90 or above (also .10 or below). From this
one might lay down the general rule that criterion populations of less
than 400 should not be used if there are many items having $p$ equal
to or greater than .90 or equal to or less than .10.
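
The standard error of formula (14) and the corresponding critical deviations from 4.0 can be computed for any N and p. The sketch below is in Python and rests on an assumption: it uses the normal-curve ratios 1.96 and 2.576 for the 5% and 1% levels, whereas the printed tables are described as following Fisher's t values, so figures obtained this way may differ from the tables in the second decimal. The function names are illustrative.

```python
from math import sqrt

def sigma_w(n, p):
    """Standard error of W under the null hypothesis, formula (14)."""
    return 2 / sqrt(n * p * (1 - p))

def critical_deviation(n, p, critical_ratio=1.96):
    """Deviation from 4.0 reached by chance with the probability implied
    by critical_ratio (assumed normal-curve value, not Fisher's t)."""
    return critical_ratio * sigma_w(n, p)

for n in (100, 200, 400, 800):
    row = [round(critical_deviation(n, p), 2) for p in (.98, .90, .50)]
    print(n, row)
```

Passing `critical_ratio=2.576` gives the corresponding "very significant" deviations under the same assumption.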
TABLE 2
Very Significant Deviations from a Weight of 4.0 for
Varying Proportions and Values of N

p or q | N   100   200   300   400   500   600   800
98 8.55 2.65 2.14 1.85 1.66 1.51 1.30 
97 . 8.10 2.16 1.76 1.53 1.35 1.24 1.06 
95 2.39 1.69 1.38 1.19 1.06 .97  .83 
.90 1.74 1.23 1.00 .87 .77 .71  .62 
80 130 92 .75 665 .58 .58 .46 
-70 1.14 80 65 56 48 .46 .40 
50 104 .74 60 52 46 .42 ~~ .36 








The question of rejection of items on the basis of weights of ques- 
tionable reliability will necessarily leave much to the judgment of the 
investigator. There is the more irksome question of what to do with 
an item which gives reliable weights for some responses to it and not 
for others. Typically, items of the kind to which these techniques 
most often apply, require responses in three categories, such as L I D
(like, indifferent, dislike) or such as Yes ? No. A tentative rule might
be that if two of the reactions to an item yield reliable weights above 
the level of significance (Table 1) or if one of the reactions yields a 
very significant weight, the item may be retained. There is still the 
question of what to do with the reactions of lower reliability for the 
same item. One disposal might be to weight those reactions 4.0,
since that is a noncommittal weight. Another would be to let any
such reaction retain any obtained weight other than 4.0 with the idea
that on the whole it is the best weight that we know. Here, again, the
investigator has latitude for his own judgment, and he may arbitrarily
draw a line below which to reject all weights.

As an example of how the computations may be conveniently carried
out, see Table 3. The two sub-groups were the extreme quarters
in a distribution of introversion-extraversion scores. N for the
sub-groups combined was 400.

TABLE 3
Work Sheet for Computing Scoring Weights and Their Reliabilities

Responses:                  Yes        ?         No
p_u (extraverts)           .710      .040      .250
p_l (introverts)           .305      .030      .665
p_u - p_l                  .405      .010     -.415
p = (p_u + p_l)/2          .5075     .035      .4575
q                          .4925     .965      .5425
pq                         .2499     .0338     .2482
4b = (p_u - p_l)/pq        1.62       .30     -1.67
W = 4b + 4                 5.62      4.30      2.33
Integral weights           6         4         2
√(pq)                      .5000     .1838     .4982
√(Npq)                     10.00     3.676     9.964
σ_W = 2/√(Npq)             .200      .545      .201
t = 4b/σ_W                 8.1   [illegible]   -8.3
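
The work sheet can be reproduced mechanically. The following Python sketch carries the three responses through formulas (8) and (14) with N = 400; the function and dictionary names are illustrative. Because it works from unrounded intermediate values, a figure may differ in the last place from one obtained by dividing rounded work-sheet entries.

```python
from math import sqrt

def work_sheet(p_u, p_l, n):
    """One column of the work sheet: weight, its standard error, and t."""
    p = (p_u + p_l) / 2
    q = 1 - p
    four_b = (p_u - p_l) / (p * q)        # deviation of W from 4.0
    sigma_w = 2 / sqrt(n * p * q)         # formula (14)
    return {"W": four_b + 4, "sigma_W": sigma_w, "t": four_b / sigma_w}

responses = {"Yes": (.710, .305), "?": (.040, .030), "No": (.250, .665)}
sheet = {name: work_sheet(p_u, p_l, 400) for name, (p_u, p_l) in responses.items()}

for name, col in sheet.items():
    print(name, round(col["W"], 2), round(col["sigma_W"], 3), round(col["t"], 1))
```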
SOME COMPARISONS OF THE MULTIPLE-FACTOR AND THE
BI-FACTOR METHODS OF ANALYSIS


FRANCES SWINEFORD 
THE UNIVERSITY OF CHICAGO 


Bi-factor and multiple-factor analyses of the same data are 
compared in two respects. First, two criteria are suggested for de- 
termining when the factorization is adequate. This problem being 
more acute for the centroid method than for the bi-factor method, 
the latter is used primarily for comparison only. It is shown also 
that the omission from the simple structure of entries smaller than 
.10 yields a pattern which is a poorer fit to the original correlations 
than is the bi-factor pattern. Second, the second-order general fac- 
tor obtained from the intercorrelations of the primaries is found to 
be highly correlated with the general factor of the bi-factor pattern. 


Professor and Mrs. Thurstone have recently published a monograph*
in which they present the factorial composition of two batteries
of mental tests administered to eighth-grade children. Their smaller
battery, of twenty-one tests, was so selected that a “simple structure”
was expected to emerge. A simple structure is a table of correlations
between tests and certain factors such that each test is significantly
correlated with one and only one such factor.† These factors
are the normals to the hyperplanes defined by the primary vectors,
which are the ultimate reference axes. The twenty-one tests did, indeed,
reveal a simple structure with only five additional entries (out
of a possible 126 entries) as great as .20.

These data form the basis of two points of investigation. The
first problem which will be considered is the extent to which factorization
should be carried. How can one tell when enough centroid
factors, for example, have been calculated? How does the centroid
method compare with the bi-factor method in this respect? The second
problem to be discussed is the relationship between the Thurstones’
“second-order” general factor, which is a Spearman general
factor computed from the intercorrelations of the primaries instead
of from those of the original tests, and the general factor determined
by the bi-factor method.


* L. L. Thurstone and Thelma Gwinn Thurstone. Factorial Studies of
Intelligence. Psychometric Monographs, No. 2. Chicago: University of Chicago Press.

† These factors have been called “simple factors,” to distinguish them from
“primary factors,” in Karl J. Holzinger, assisted by Frances Swineford and
Harry Harman, Student Manual of Factor Analysis, pp. 68-78. Chicago: Statistical
Laboratory, Department of Education, University of Chicago, 1937.

The problem of the degree to which factorization should be car- 
ried is one of long standing. Its solution is not forthcoming in the 
present discussion, but certain aspects of it will be pointed out. The 
problem varies with different factorial methods. In the centroid solu- 
tion, for example, successive factors are removed until the investi- 
gator is satisfied that all the significant linkages among the variables 
have been accounted for. In a bi-factor solution the number of fac- 
tors is fixed, but modifications of the predetermined plan can be made 
to include many small and doubtless insignificant entries in the factor 
pattern.

Considering first the centroid for Thurstone’s twenty-one tests, 
it must be noted that three tests were selected to measure each of 
seven primary factors. The centroid solution, therefore, should be 
carried through at least seven factors. An eighth factor, which would 
allow for a possible additional primary, was computed by the authors. 
Now suppose these tests to have been selected without previous knowledge
of their factorial composition. In an earlier volume,* Thurstone
indicates several criteria for judging when to stop factoring. He sug- 
gests first that one continue to extract factors “until one can ascer- 
tain by mere inspection of the residuals that more than enough factors 
have been extracted” (italics mine). Noting that this procedure is 
extremely laborious, he then proposes comparing “the dispersion of 
the residuals with the probable errors of the given coefficients, but 
this is an uncertain criterion because the dispersion decreases only 
slightly for each factor after the first two or three factors have been 
extracted.” He also shows that the range of the entries in a column 
indicates the maximum value that can be obtained upon rotation of 
axes. A column having a small range and only a few entries at its 
extremes would be considered unimportant. Finally, Thurstone pre- 
sents a formula, empirically derived, for determining at what stage 
the limit of factorization has been reached. The data necessary for 
applying this formula not having been included in Factorial Studies 
of Intelligence, it has not been feasible to test it. 

There are, however, additional aids for determining the limit of 
useful factorization. When not all the factors have been removed, 
there are some large residuals remaining. These usually occur among 
the variables that are relatively closely associated. In other words, 
some relationship exists between the original correlations and the
corresponding residual correlations. This relationship has been expressed
as a correlation coefficient in Table 1.


* L. L. Thurstone. Primary Mental Abilities, pp. 65 ff. Psychometric
Monographs, No. 1. Chicago: University of Chicago Press, 1938.


TABLE 1
Correlations between Zero-Order Correlations and
Successive Tables of Residual Correlations

First factor removed      .828 ± .015
Second factor removed     .586 ± .031
Third factor removed      .480 ± .038
Fourth factor removed     .310 ± .042
Fifth factor removed      .228 ± .044
Sixth factor removed      .100 ± .046
Seventh factor removed    .032 ± .046
Eighth factor removed     .040 ± .046


The correlation for the sixth factor removed would not be con- 
sidered statistically significant, although the chances are 18 to 1 that 
it is greater than zero. There is no question that the correlation for 
the seventh factor removed is insignificant. The eighth factor, which 
was computed and presented in the monograph, was termed a “resi- 
dual factor” and disregarded in the discussion. This procedure, then, 
may prove useful in determining the limit of factorization. 
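
The ± figures attached to these correlations can be approximated numerically. The sketch below is in Python and assumes, as an inference from the printed values rather than from any statement in the text, that they are classical probable errors, .6745(1 − r²)/√n, with n taken as the 210 correlations among twenty-one tests. The function name is illustrative.

```python
from math import sqrt

def probable_error_r(r, n_pairs):
    """Classical probable error of a correlation coefficient (assumed formula)."""
    return 0.6745 * (1 - r * r) / sqrt(n_pairs)

N_PAIRS = 21 * 20 // 2     # 210 correlations among twenty-one tests

for label, r in [("first", .828), ("second", .586), ("sixth", .100)]:
    pe = probable_error_r(r, N_PAIRS)
    print(f"{label} factor removed: r = {r:.3f}, P.E. = {pe:.3f}, ratio = {r / pe:.1f}")
```

On this reading the sixth-factor correlation is a little over twice its probable error, which accords with its being treated as not statistically significant.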

In the case of the bi-factor method the problem is somewhat dif- 
ferent. Here, the number of factors is predetermined and only those 
factor weights are computed which are expected to be significant, in 
contrast to the centroid method, wherein every cell in the factor pat- 
tern is filled. With the omission of a relatively large number of pat- 
tern weights, even though each may be insignificant, there remains 
some correlation between the residuals and the original coefficients. 
For the Thurstone data, an unmodified bi-factor pattern was first at- 
tempted, the test groups corresponding to those of the authors. A 
correlation of .593 was obtained not for 210 final residuals, but for 
the 189 which cut across the groups, and which remained as final resi- 
duals when the general factor only was removed. Examination of 
these residuals revealed a number of overlaps between tests in dif- 
ferent groups. The modified pattern (Table 4), having twelve more 
entries than the original one, gave rise to only 149 final residuals 
when the general factor was removed. These are correlated with the 
original correlations to the extent of .395 + .047, and the correspond- 
ing correlation for the 210 residuals with general and group factors 
removed is .305 + .042. Now it is not possible at this writing to set 
down any standard for such correlations. Those for the present data 
are statistically significant. They have been included only for pur- 
poses of comparison, and would have no function in the process of 
obtaining a bi-factor solution, all pattern plans being based upon a
priori considerations and upon the tables of residual correlations.

Another criterion for judging the degree of factorization has
been mentioned elsewhere.* Some factor analysts look upon the residuals
as random deviations from zero. If they are, then their distribution
might be expected to approximate the normal curve. Certainly,
the writer has noted many times that when the factors have
not all been included in the bi-factor plan, then the distribution of
residuals is positively skewed. This situation is obvious, for the
method of solution is an averaging process such that the mean residual
approximates zero. Any large residual due to an overlap not
planned for is balanced, therefore, by a number of small residuals of
opposite sign. Since mental data yield generally positive intercorrelations
and positive factor weights, the above-mentioned large residuals
are positive and the small ones negative, thus resulting in the
positively skewed form of distribution.


TABLE 2
Skewness of Distribution of Residuals from
Centroid and Bi-factor Solutions

                                              Standard     Skewness
Residuals               Mean      Median      Deviation    Sk = 3(M - Md)/σ
Centroid:
  I removed            -.0150     -.0342       .1164          .49
  II removed           -.0105     -.0162       .0852          .20
  III removed          -.0082     -.0166       .0660          .38
  IV removed           -.0060     -.0124       .0514          .37
  V removed            -.0040     -.0069       .0394          .22
  VI removed           -.0030     -.0048       .0282          .19
  VII removed          -.0015     -.0013       .0192         -.03
  VIII removed         -.0011     -.0019       .0160          .15
Bi-factor:
  Initial pattern
    (189 residuals)     .0024     -.0026       .0692          .22
  Modified pattern
    (210 residuals)     .0029      .0034       .0411         -.04

The skewness has been computed for the successive tables of residuals
as the centroids are removed and for two tables of bi-factor
residuals. The formula employed is $\mathrm{Sk} = 3(M - \mathrm{Md})/\sigma$, with
standard deviations corrected by Sheppard’s formula. These data are given
in Table 2. The measures of skewness are all positive until the seventh
centroid has been removed, at which point the skewness becomes
virtually zero. The bi-factor method, likewise, shows a positively
skewed distribution of residuals when some factors remain unaccounted
for, but not when factorization has been adequate.


* Frances Swineford and Karl J. Holzinger. A Study in Factor Analysis:
The Reliability of Bi-factors and Their Relation to Other Measures. Supplementary
Educational Monograph. Chicago: Department of Education, University of
Chicago, 1942 (in press).
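
The skewness measure is simple to compute from a mean, a median, and a standard deviation. The sketch below is in Python; the function names are illustrative, and the version working from raw residuals uses the ordinary population standard deviation without Sheppard's correction, which applies only to grouped data. The sample list of residuals is hypothetical.

```python
from statistics import mean, median, pstdev

def pearson_skewness(m, md, sd):
    """Sk = 3(M - Md)/sigma, the measure used in Table 2."""
    return 3 * (m - md) / sd

def skewness_of(residuals):
    """Same measure from raw values (no Sheppard correction applied)."""
    return pearson_skewness(mean(residuals), median(residuals), pstdev(residuals))

# Check against two rows of Table 2 (first and seventh centroid removed).
print(round(pearson_skewness(-.0150, -.0342, .1164), 2))
print(round(pearson_skewness(-.0015, -.0013, .0192), 2))

# Hypothetical residuals: a few large positive values balanced by many
# small negative ones, which yields a positive skewness.
print(skewness_of([.12, .09, -.02, -.03, -.03, -.04, -.04, -.05]) > 0)
```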

It has been pointed out that in the centroid method of factor so- 
lution a value is computed for every cell of the pattern and that in 
the bi-factor method only the significant entries are computed, thus 
eliminating a great amount of unnecessary labor. Once calculated, 
the centroid reference axes are then rotated to a simple structure (de- 
fined earlier), in which usually only the significant weights are con- 
sidered. The presence of this large number of very small values in 
the centroid solution makes possible the excellent fit to the original 
correlations. In fact, the standard deviation of the residuals with the 
seventh centroid removed is only .0192, while the standard deviation 
of the final residuals for the modified bi-factor pattern is .0411 (see 
Table 2). Since only the significant entries are employed in the dis- 
cussion of the simple structure, the question arises as to the adequacy 
of fit when the insignificant entries are omitted. 

In order to answer this question, the correlations reproduced
from the structure must be obtained. In the Thurstone monograph*
the structure is presented, for purposes of interpretation, with all values
less than .20 omitted. Lacking any formal means of testing these
values for significance, the writer decided to “play safe,” and omit
only those entries which are less than .10 in absolute value. The structure
and the factor intercorrelations† were employed to obtain a pattern
(Table 5) by the method described by Holzinger and Harman.‡
Matrix multiplication of the structure by the transpose of the pattern
gives the reproduced correlations, which are then subtracted from the
original correlations. The standard deviation of these residuals is
.0692, a significantly greater value than that obtained for the bi-factor
pattern (.0411).
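
The arithmetic of reproducing correlations from a structure and a pattern may be illustrated on a small scale. The sketch below is in Python; the three-test, two-factor matrices and the "observed" correlations are hypothetical numbers chosen for the illustration and have no connection with the Thurstone data. Only the off-diagonal residuals are used.

```python
from statistics import pstdev

def matmul_transpose(a, b):
    """Return the product of a and the transpose of b (lists of rows)."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in b] for row_a in a]

# Hypothetical structure (test-factor correlations) and pattern (factor weights)
# for two factors correlated .50; small entries are omitted from the pattern.
structure = [[.80, .40], [.70, .35], [.30, .60]]
pattern = [[.80, .00], [.70, .00], [.00, .60]]
observed = [[1.00, .58, .22], [.58, 1.00, .25], [.22, .25, 1.00]]

reproduced = matmul_transpose(structure, pattern)
n = len(observed)
residuals = [observed[i][j] - reproduced[i][j] for i in range(n) for j in range(i + 1, n)]
residual_sd = pstdev(residuals)

print([round(x, 2) for x in residuals], round(residual_sd, 4))
```

The standard deviation of such residuals is the figure compared between solutions in the text.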

The second problem which is the subject of this paper is the nature
of the second-order general factor obtained by Thurstone. The
seven primary factors identified by his data may be represented by
vectors each of which passes through a cluster of tests. These vectors
are not mutually orthogonal, but have angular separations of less than
ninety degrees. That is, they are positively correlated. The Thurstones
employed their intercorrelations to compute a single factor
present in all primaries. This factor they call a “general intellective
factor.” The bi-factor solution contains a general factor plus seven
group factors, all mutually orthogonal (uncorrelated). Thus in the
geometrical interpretation, the simple structure of the Thurstones lies
in a space of seven dimensions, and the bi-factors lie in an
eight-dimensional space. In the paragraphs to follow, an informal
comparison of the two general factors will be made.


* Factorial Studies of Intelligence, p. 35.

† Ibid., Table 4, p. 31.

‡ Karl J. Holzinger and Harry H. Harman. Factor Analysis, pp. 386-89.
Chicago: University of Chicago Press, 1941.

The second-order general factor is expressed as correlations with 
the primary factors, each of which is identified by three tests. Corre- 
lations of the bi-factor general factor with composites of these groups 
of three tests have been computed from the bi-factor pattern by means 
of Spearman’s formula for the correlation of sums. The Thurstones 
omitted their perceptual factor, P, from this part of their analysis 
because they have found it to be a relatively unstable factor. Our 
comparison must therefore be based upon the results for six primar- 
ies. In order to render the two general factors comparable, it was de- 
cided to compute the correlations between that of the Thurstones and 
the six composite tests. 

The method of obtaining these correlations will be described 
briefly. First, regression equations were derived for estimating the 
bi-factor g and each of the six primaries (omitting P) from the six 
corresponding composite tests. The regression coefficients are listed 
in Table 3, together with the multiple-correlation coefficients. Next, 
a regression equation was obtained for estimating the Thurstone sec- 
ond-order g , which will be denoted by g’, from the primary factors, 
as follows: 


$$\hat{g}' = .076M + .183V + .257W + .022S + .111N + .520R.$$


The primary factors are in standard form. The multiple correlation 
is .921. 
TABLE 3
Regression Coefficients for Estimating the Bi-factor g and
Six Primary Factors from Six Composite Tests

                            Regression Coefficients
Composite Test     β_gi    β_Mi    β_Vi    β_Wi    β_Si    β_Ni    β_Ri
z_M                .065    .742    .022    .025   -.002   -.012    .064
z_V                .092    .082    .959    .015   -.028    .009    .054
z_W                .191    .081    .039    .848    .012    .064   -.014
z_S                .059   -.035   -.020   -.028    .899   -.044    .074
z_N                .235   -.065   -.014    .074    .058    .872    .095
z_R                .498    .026   -.014    .044    .069    .038    .778
Multiple-correlation
  coefficient      .885    .799    .971    .918    .936    .910    .914
Noting that $g' = \hat{g}' + e$, we may calculate the correlation of $g'$
with each composite test, $z_i$, by multiplying both sides by $z_i$ and
summing over $N$ cases. The variables $g'$ and $z_i$ are all in standard form.

$$\frac{\sum g' z_i}{N} = r_{g'z_i} = .076\,r_{Mz_i} + .183\,r_{Vz_i} + .257\,r_{Wz_i} + .022\,r_{Sz_i} + .111\,r_{Nz_i} + .520\,r_{Rz_i} + 0.$$


These correlations, together with the composite-test intercorrelations 
can now be employed to compute the regression equation for estimat- 
ing g’ from the composite tests. This equation follows: 


$$\hat{g}' = .100z_M + .214z_V + .231z_W + .040z_S + .159z_N + .419z_R.$$


The multiple correlation, $R_{g'}$, is .892.

If it be assumed that the errors of estimate are uncorrelated, the
correlation between the factors, $r_{gg'}$, is .780, and the correlation
between their estimates, $r_{\hat{g}\hat{g}'}$, is .989. To the extent that the errors of

TABLE 4
Bi-factor Pattern
(Unique factors omitted)

                              Factor
Test | g | P’ | M’ | V’ | W’ | S’ | N’ | R’
1 416 | at oe ae = | BBB ane 
2 538 i Eg, Tea amen Ce. 2 
3 627 | a eg ee le fo eee ee Gee 
a-E)6 Oe 5e0 | 191 | 185— | ee Ress 
S 1 BS T- w a SR BP Mes. 4. Sad ea 
6 | .868 — x A og eee eer ese ao Fk ee 
7 > gpme a ee apne Geen 
8 ' See | oie , 22 hUchea oe |. angie sali 
9 a ae ae | 544 | 067 |... Beets Ae 
10 —_ | a _ ST abies ee a oe eee 
11 MeN ice = Pe SS rae umes abs 
12 “ 2 ee Ane ee 32 2 Ree eee oe 
13 . a ao | Ares ney wae . 2 ees ee 
14 | ie eee | re Bs Sa ae ee ee 
15 os Re» cata ee _ - ae) a 
16 eek oe vee de i ae, Bplay jee" ; ae 
17 ao | i Aeon | peeks pee MF, i 
18 a ee ORS gee ae 2056+) 361 | ..... 
19 i Weta? Reo lah ees geek | rca wakes 466 
20 a E Toeees ee ae ORE age *: 352 
21 714 etiam Or | eG, ee ee ae | 170 
estimate are not independent, these correlations may be still higher.
It is thus clear that the Thurstones’ second-order general factor is 
very similar to the general factor obtained by the bi-factor method. 
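
The figures just quoted are numerically consistent with one another. The sketch below, in Python, shows one way of connecting them; it is offered as an assumption about the computation, not as the method actually followed: if the errors of estimate are uncorrelated, the correlation between two factors equals the correlation between their estimates multiplied by the two multiple correlations (.885 for the bi-factor g in Table 3, and .892 for the second-order factor). The function name is illustrative.

```python
def factor_correlation(r_estimates, multiple_r_first, multiple_r_second):
    """Correlation between two factors implied by the correlation between
    their estimates, assuming uncorrelated errors of estimate."""
    return r_estimates * multiple_r_first * multiple_r_second

r_factors = factor_correlation(.989, .885, .892)
print(round(r_factors, 3))
```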


TABLE 5
Simple Pattern Computed from Simple Structure with
Entries Less than .10 Omitted
(Unique factors omitted)

                              Factor
Test | P” | M” | V” | W” | S” | N” | R”
1 | 4 20 | .44 A4 39 78 16 
2 82 ae 32 58 50 68 
4 ae 19 | 82 43 25 27 48 
4 20 16 41 39 17 24 57 
5 37 | 44 23 19 27 03 34 
6 17 | .79 42 A8 16 35 51 
7 41 AT | 1.24 68 25 54 99 
8 30 38 | 1.10 56 17 41 70 
9 39 42 | 1.16 60 | .36 47 81 
10 38 38 59 | 1.07 23 | 51 68 
11 20 35 52 | 1.00 16 | .46 56 
12 23 36 68 89 16 45 | .60 
13 39 13 17 17 | .89 25 | .45 
14 44 15 | .19 19 99 28 | .50 
15 41 144] 1s | .18 94 | 26 | 48 
16 23 17 40 | .48 23 | 100 | .65 
17 24 18 42 50 25 | 105 | .67 
18 39 24 | 46 48 51 | .88 | 87 
19 45 35 | 66 | 49 | 25 | .68 | 118 
20 AT 42 | 83 59 | 35 58 | 1.21 
21 | 40 | 35 | 55 | 60 | 21 | 52 | 1.05 
SOME STATISTICAL OPERATIONS ON THE
COUNTING SORTER


PHILIP H. DUBOIS 
UNIVERSITY OF NEW MEXICO 


The IBM card counting sorter may be readily used for a number 
of statistical operations independent of the tabulating unit. In this 
article, use of the counting sorter in computing the mean and standard 
deviation with 2-digit variables and the coefficient of correlation with 
scores coded in a single column will be described. Both processes in- 
volve machine work which is essentially the same. A method for find- 
ing the correlation between 2-digit variables will also be given. 


Computation of the Mean and Standard Deviation 
The 2-digit variable X may be represented as 10A + B, in which 
A is the tens-digit and B is the units-digit. 


If      $X = 10A + B$,
then    $\sum X = 10\sum A + \sum B$,
        $X^2 = 100A^2 + 20AB + B^2$,
and     $\sum X^2 = 100\sum A^2 + 20\sum AB + \sum B^2$.


If we find $N$, $\sum A$, $\sum A^2$, $\sum B$, $\sum B^2$, and $\sum AB$, all the data necessary
for obtaining the mean and standard deviation become available.
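
These identities are easily verified. The following Python sketch splits each score into its tens-digit A and units-digit B and recovers the sum and the sum of squares from the six digit sums; the function name and the list of scores are hypothetical.

```python
def sums_from_digits(scores):
    """Return (sum of X, sum of X squared) computed only from digit sums."""
    tens = [x // 10 for x in scores]       # A, the tens-digit
    units = [x % 10 for x in scores]       # B, the units-digit
    sum_a, sum_b = sum(tens), sum(units)
    sum_a2 = sum(a * a for a in tens)
    sum_b2 = sum(b * b for b in units)
    sum_ab = sum(a * b for a, b in zip(tens, units))
    sum_x = 10 * sum_a + sum_b
    sum_x2 = 100 * sum_a2 + 20 * sum_ab + sum_b2
    return sum_x, sum_x2

scores = [0, 7, 12, 25, 38, 41, 53]        # hypothetical 2-digit scores
print(sums_from_digits(scores))
print(sum(scores), sum(x * x for x in scores))
```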

Table 1 illustrates the work. The data are the scores of 2645 high- 
school seniors on a mathematics test in which the range is from 0 to 
53. The steps in the process follow: 


1. Prepare a two-way table labeled from 0 through 9 on 
the x-axis and 0 up through the highest A on the y-axis. 
2. Sort all cards on the tens-digit or A column. The cards 
are removed from the pockets and kept in separate groups. 
3. Adjust the machine to sort and count on the units-digit 
or B-column. 

4. Sort the highest A-group of cards (in the numerical example,
the 5-group). Record the frequencies in the counters
in the top row of the table. Record the total so far under Cf_A,
in the column at the right. Do not clear the machine, as the
frequencies must be allowed to accumulate.

5. Repeat with the other A-groups, recording the readings
in the ten counters, and the total counter row by row. Only
after cumulated frequencies for the 0-group, which are the
B-frequencies, are recorded and N noted from the total
counter can the machine be cleared.
6. Cumulate the f_B's down through zero and record underneath
the diagram. The last Cf_B is N.
7. Enter 1 in a calculating machine and multiply by the
Cf_A in the 1-row, in the example, 2547. Allow both products
and multipliers to accumulate. Change the 1 to the next higher
odd number, 3, and multiply by the Cf_A of the 2-row. The
proper multiplicands have been indicated in a column at the
right of the diagram. Continue through the top Cf_A. The
accumulation of the multipliers is ΣA, or 5003 in the example,
and the accumulation of the products is ΣA², which
is 12469 in the example.*
8. Similarly, after writing the multiplicands or successive
odd numbers below the Cf_B's beginning with 1 under the
1-column, find ΣB and ΣB², in the example 12081 and 77263.
9. Find the sum of the entries in the 9-column excluding
the entry in the 0-row and enter in the CΣCf_A″ row. The
sum in this case is 388. Take this sum as a sub-total if an
adding machine is used or do not clear the dials if a calculator
is used. The entries in the column are the cumulative
A-frequencies grouped according to B and may be designated
as the Cf_A″'s. If we do not clear the adding machine from
column to column, we cumulate the ΣCf_A″'s toward the
origin.
10. Repeat with the other columns. The CΣCf_A″ for the
0-column is computed only as a check. It is ΣA.
11. Sum the CΣCf_A″'s, excluding the entry in the 0-column.
The sum is ΣAB, in the example 20701.
12. Compute ΣX and ΣX² by the formulas given above and
the mean and standard deviations by the usual raw score
formulas,

$$M = \frac{\sum X}{N}, \qquad \sigma = \frac{1}{N}\sqrt{N\sum X^2 - \left(\sum X\right)^2}.$$
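
As a worked check of step 12, the sums reported in steps 7, 8, and 11
(N = 2645, ΣA = 5003, ΣA² = 12469, ΣB = 12081, ΣB² = 77263, ΣAB = 20701)
can be carried through the raw score formulas. This is only an illustrative
sketch in Python; the function name is hypothetical.

```python
import math


def mean_and_sigma(n, sum_a, sum_a2, sum_b, sum_b2, sum_ab):
    """Raw score mean and standard deviation from the digit sums."""
    sum_x = 10 * sum_a + sum_b                      # ΣX = 10ΣA + ΣB
    sum_x2 = 100 * sum_a2 + 20 * sum_ab + sum_b2    # ΣX² = 100ΣA² + 20ΣAB + ΣB²
    mean = sum_x / n
    sigma = math.sqrt(n * sum_x2 - sum_x ** 2) / n  # (1/N) sqrt(NΣX² - (ΣX)²)
    return sum_x, sum_x2, mean, sigma


# Sums taken from the numerical example in the steps above.
sum_x, sum_x2, mean, sigma = mean_and_sigma(2645, 5003, 12469, 12081, 77263, 20701)
print(sum_x, sum_x2, round(mean, 2), round(sigma, 2))  # 62111 1738183 23.48 10.28
```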


A check on the work is illustrated in Table 2. Here the cards
have first been sorted on B and then by groups sorted on A. From
the total counter the Cf_B's are read. The CΣCf_B″'s are found by
adding the columns beginning in the highest A column but never adding
into the 0-row and by not clearing the machine from column to column.
The CΣCf_B″ for the 0-column is ΣB and the sum of the other
CΣCf_B″'s is ΣAB. If this check is to be made, ΣB and ΣB² may be
computed from the cumulations made by the sorter rather than
cumulating them by hand as in Table 1.

*For an algebraic proof of the computational principle involved, see Du
Bois, Philip H. A statistical time-saver for means and sigmas. J. Cons. Psychol.,
1939, 3, 80-82.

The process of computing the ΣAB will be readily understood if
it is remembered that when the process illustrated in Table 1 is
followed, each card punched with a given AB is represented A times in
its column and therefore A times in the ΣCf_A″ of the column. Since
the ΣCf_A″'s are cumulated toward the origin and the column is B
steps from the origin, the A appears B times in the ΣCΣCf_A″. The
same thing happens in the check process except that columns and
rows are reversed.
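
The cumulation argument can also be followed in a short simulation. The
sketch below is a Python imitation of the diagram, not a description of the
machine itself: the function name, the internal tables, and the example
scores are all hypothetical. It builds the cumulative frequencies row by row
without "clearing", applies the successive odd numbers, and cumulates the
column sums toward the origin.

```python
def sorter_sums(scores):
    """Mimic the cumulative-frequency diagram for scores X = 10A + B."""
    max_a = max(x // 10 for x in scores)

    # freq[a][b] = number of cards with tens-digit a and units-digit b
    freq = [[0] * 10 for _ in range(max_a + 1)]
    for x in scores:
        freq[x // 10][x % 10] += 1

    # cum[a][b] = cards in column b with tens-digit >= a; the counters are
    # never cleared between A-groups, so the readings accumulate downward.
    cum = [[0] * 10 for _ in range(max_a + 1)]
    running = [0] * 10
    for a in range(max_a, -1, -1):
        for b in range(10):
            running[b] += freq[a][b]
            cum[a][b] = running[b]

    cf_a = [sum(row) for row in cum]           # total-counter reading per row
    f_b = cum[0]                               # the 0-row holds the B-frequencies
    cf_b = [sum(f_b[b:]) for b in range(10)]   # cumulated from 9 down through 0

    # Successive odd numbers 1, 3, 5, ... applied from the 1-row (1-column) up.
    sum_a = sum(cf_a[1:])
    sum_a2 = sum((2 * a - 1) * cf_a[a] for a in range(1, max_a + 1))
    sum_b = sum(cf_b[1:])
    sum_b2 = sum((2 * b - 1) * cf_b[b] for b in range(1, 10))

    # Column sums excluding the 0-row, cumulated toward the origin;
    # the 0-column total is only a check on ΣA and is left out of ΣAB.
    column_sums = [sum(cum[a][b] for a in range(1, max_a + 1)) for b in range(10)]
    cumulated = [sum(column_sums[b:]) for b in range(10)]
    sum_ab = sum(cumulated[1:])

    return len(scores), sum_a, sum_a2, sum_b, sum_b2, sum_ab


# Hypothetical scores, not data from the article.
example_scores = [3, 7, 10, 14, 14, 21, 29, 35, 38, 42, 47, 53]
n, sum_a, sum_a2, sum_b, sum_b2, sum_ab = sorter_sums(example_scores)
```

Each card with tens-digit A is counted in A rows of its column, and each
column sum is carried into B cumulated totals, so the final figure is the
sum of the products AB, as the paragraph above argues.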

When N is over 400 the counting sorter method for finding M
and σ compares favorably in time required with the method involving
both sorter and tabulator, and with very large N’s is considerably 
faster if only one diagram is made. More arithmetic is involved, how- 


TABLE 1 


Showing all steps in the computation of ΣX and ΣX² for a 2-digit variable,
Counting Sorter Method. The data are the scores of 2645 high-school seniors on
a mathematics test. Cards, sorted on A, have been re-sorted by groups on B.

A (Tens- |                          B (Units-Digit)                            |
Digit)   |    0 |    1 |    2 |    3 |    4 |    5 |    6 |    7 |    8 |    9 | Cf_A |  m
5        |   11 |   15 |    6 |    6 |    4 |    7 |    6 |    4 |    1 |      |   60 |  9
4        |   39 |   37 |   24 |   22 |   24 |   19 |   26 |   16 |   19 |   12 |  238 |  7
3        |   95 |   81 |   72 |   62 |   61 |   62 |   61 |   48 |   40 |   39 |  621 |  5
2        |  216 |  181 |  178 |  181 |  157 |  140 |  137 |  121 |  123 |  103 | 1537 |  3
1        |  272 |  241 |  256 |  262 |  245 |  273 |  260 |  245 |  259 |  234 | 2547 |  1
0 (f_B)  |  273 |  242 |  259 |  263 |  246 |  275 |  268 |  261 |  284 |  274 | 2645 = N
Cf_B     | 2645 | 2372 | 2130 | 1871 | 1608 | 1362 | 1087 |  819 |  558 |  274 |
m        |      |    1 |    3 |    5 |    7 |    9 |   11 |   13 |   15 |   17 |
CΣCf_A″  | 5003 | 4370 | 3815 | 3279 | 2746 | 2255 | 1754 | 1264 |  830 |  388 |

N = 2645                        ΣX = 10ΣA + ΣB
ΣA = ΣCf_A = 5003                  = 50030 + 12081 = 62111
ΣA² = ΣmCf_A = 12469            ΣX² = 100ΣA² + 20ΣAB + ΣB²
ΣB = ΣCf_B = 12081                  = 1246900 + 414020 + 77263
ΣB² = ΣmCf_B = 77263                = 1738183
ΣAB = ΣCΣCf_A″ = 20701          M = 23.48


                                σ = 10.28

TABLE 2

Showing check on computation of ΣAB
(Data are the same as in Table 1)

B (Units- |                A (Tens-Digit)                 |
Digit)    |     0 |     1 |    2 |    3 |   4 |   5 | Cf_B |  m
9         |    40 |   131 |   64 |   27 |  12 |     |  274 | 17
8         |    65 |   267 |  147 |   48 |  30 |   1 |  558 | 15
7         |    81 |   391 |  220 |   80 |  42 |   5 |  819 | 13
6         |    89 |   514 |  296 |  115 |  62 |  11 | 1087 | 11
5         |    91 |   647 |  374 |  158 |  74 |  18 | 1362 |  9
4         |    92 |   735 |  470 |  195 |  94 |  22 | 1608 |  7
3         |    93 |   816 |  589 |  235 | 110 |  28 | 1871 |  5
2         |    96 |   894 |  695 |  283 | 128 |  34 | 2130 |  3
1         |    97 |   954 |  795 |  327 | 150 |  49 | 2372 |  1
0 (f_A)   |    98 |  1010 |  916 |  383 | 178 |  60 | 2645 = N
CΣCf_B″   | 12081 | 11337 | 5988 | 2338 | 870 | 168 |

ΣCΣCf_B″ = ΣAB = 20701
ΣCf_B = ΣB = 12081
ΣmCf_B = ΣB² = 77263


ever, and its greatest advantage is in a small punch card installation 
which does not include the tabulating unit. The method is especially 
convenient when the cards contain data, such as questionnaire items, 
that are best handled on a sorter and when M and σ of a group of
scores are to be found incidentally. 


Correlation of Single-Digit Variables 


Table 3 illustrates the complete process of obtaining the Pearson 
coefficient of correlation with scores for each variable coded in a sin- 
gle column. For the L variable a step interval of 7 is used and for 
the Q variable a step interval of 6. The twelve positions in the col- 
umn are used for twelve of the steps and the thirteenth step is indi- 
cated by passing over the column, the frequencies being read from the 
reject counter. Thirteen is the maximum number of steps that can 
be handled by this method. 

In this table the cards have been sorted on L and then the separate
groups re-sorted on Q beginning with the R group. Group by
group the cumulative frequencies are recorded. The Cf_L entries are
read from the total counter.

Computational procedures for solving for r are exactly the same
as in computing ΣX and ΣX² for 2-digit variables. The arbitrary
origins are the mid-points of the lowest steps. ΣL and ΣL² are
obtained simultaneously by multiplying successive odd numbers,
beginning with unity in the step above the arbitrary origin, by the Cf_L's.
The accumulation of the multipliers is ΣL and of the products ΣL².

The Q-frequencies in the bottom row are cumulated toward the
origin and a similar process yields ΣQ and ΣQ².

Beginning with the right-hand column, the cell entries are
summed, but the machine is not cleared from column to column. The
CΣCf_L″'s are recorded at the bottom of the diagram. The CΣCf_L″ in
the 0-column is a check on ΣL. The sum of the other CΣCf_L″'s is
ΣLQ. Then

$$r = \frac{N\sum C\sum Cf_L'' - \sum Cf_L \sum Cf_Q}{\sqrt{N\sum mCf_L - \left(\sum Cf_L\right)^2}\,\sqrt{N\sum mCf_Q - \left(\sum Cf_Q\right)^2}}
= \frac{N\sum LQ - \sum L \sum Q}{\sqrt{N\sum L^2 - \left(\sum L\right)^2}\,\sqrt{N\sum Q^2 - \left(\sum Q\right)^2}}$$
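
Once the six totals are in hand, the coefficient follows from the raw score
formula alone. A minimal sketch in Python is given here for illustration;
the function name and the coded step values are hypothetical and are not
the data of Table 3.

```python
import math


def pearson_r_from_sums(n, sum_l, sum_l2, sum_q, sum_q2, sum_lq):
    """Raw score formula: (NΣLQ - ΣLΣQ) / (sqrt(NΣL² - (ΣL)²) sqrt(NΣQ² - (ΣQ)²))."""
    numerator = n * sum_lq - sum_l * sum_q
    denominator = (math.sqrt(n * sum_l2 - sum_l ** 2)
                   * math.sqrt(n * sum_q2 - sum_q ** 2))
    return numerator / denominator


# Hypothetical single-column step codes (0 = lowest step) for eight cards.
l_steps = [0, 1, 1, 2, 3, 4, 4, 5]
q_steps = [1, 0, 2, 2, 3, 3, 5, 4]

r = pearson_r_from_sums(
    len(l_steps),
    sum(l_steps),
    sum(v * v for v in l_steps),
    sum(q_steps),
    sum(v * v for v in q_steps),
    sum(a * b for a, b in zip(l_steps, q_steps)),
)
```

Because r is unchanged by the choice of origin and step interval, the coded
steps may be used directly without converting back to raw scores.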


Table 4 illustrates the check process which is exactly analogous
to the check process in Table 2 for verifying steps in computing ΣX
and ΣX². From this work ΣQ, ΣQ², and ΣLQ may be conveniently
found.

If a number of intercorrelations are to be computed by this pro- 
cess, one diagram is necessary for each r. With 400 cards a diagram 
can be made and the sum of the cross-products obtained in about ten 
minutes. For each variable ΣX and ΣX² need to be found only once.

Correlation of Two-Digit Variables 

If two 2-digit variables have been punched in cards, it is possible
to obtain the coefficient of correlation by an extension of the method
used in finding ΣX and ΣX². Let A be the tens-digit and B the
units-digit of X, and C the tens-digit and D the units-digit of Y.
Formulas and computational procedures have already been given for ΣX
and ΣX². ΣXY = 100ΣAC + 10ΣAD + 10ΣBC + ΣBD. The ordinary
raw score formula for r is used. Six diagrams in all are necessary:
four to be used in computing ΣXY and two for computing the
sums and sums of the squares of the two variables.
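
The four-term expression for ΣXY follows from multiplying out
(10A + B)(10C + D). A brief sketch in Python, with hypothetical function
names and hypothetical paired scores, shows the digit cross-products
reproducing the direct sum.

```python
def sum_xy_from_digits(xs, ys):
    """ΣXY = 100ΣAC + 10ΣAD + 10ΣBC + ΣBD for X = 10A + B and Y = 10C + D."""
    a = [x // 10 for x in xs]   # tens-digit of X
    b = [x % 10 for x in xs]    # units-digit of X
    c = [y // 10 for y in ys]   # tens-digit of Y
    d = [y % 10 for y in ys]    # units-digit of Y

    def cross(u, v):
        """Sum of cross-products of two digit columns."""
        return sum(p * q for p, q in zip(u, v))

    return 100 * cross(a, c) + 10 * cross(a, d) + 10 * cross(b, c) + cross(b, d)


# Hypothetical paired 2-digit scores, not data from the article.
x_scores = [12, 25, 37, 40, 58, 63]
y_scores = [21, 19, 44, 35, 60, 72]
sum_xy = sum_xy_from_digits(x_scores, y_scores)
```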

To find the sum of the cross-products of any two digits, the cards 
are arranged on one of the digits and then by groups re-sorted on the 
other. At the end of the re-sort, the cards have been arranged for a
new sort by groups. For the four digits of this type of correlation
problem, the work can proceed as follows: 


Sorted on          Re-sort by Groups on     Yield

A                  B                        ΣAB
B                  C                        ΣBC
C                  D                        ΣCD
D                  A                        ΣAD
A                  C                        ΣAC
(New sort) B       D                        ΣBD


When N is large and the data are already punched in the cards,
the procedure may be useful. It is never recommended when a
tabulator is available.


CLINICAL PSYCHOLOGY—ART OR SCIENCE*†


THEODORE R. SARBIN 
UNIVERSITY OF CHICAGO 


This paper questions the oft-repeated statement that clinical 
psychology is an art by examining the main functions of clinical 
psychologists, i.e., diagnosis and treatment. In examining the concept 
of diagnosis, evidence is presented which supports the notion that a 
diagnostic statement has meaning only when it has a referent in 
the future—when it provides a prediction. A prediction (probability- 
statement) is determined empirically and may be stated in terms of a 
regression equation or in terms of a crude generalization from clinical
experience. Treatment likewise is determined by tacit or expressed
predictions of behavior under alternative conditions. The
various conceptions of art as applied to clinical psychology are ex- 
amined and the conclusion is drawn that clinical psychology is a 
scientific as opposed to an artistic or intuitive enterprise. 


The problem under discussion is in need of clarification for at 
least two reasons. First, the manner in which we provide training for 
the growing number of clinicians will depend largely upon our con- 
ception of clinical psychology as art or science. If our analysis shows 
it to be an art, then we are faced with the admittedly difficult task of 
training people in various forms of art. Second, the increasing interest 
in clinical research creates problems around the art-or-science ques- 
tion. Research based on artistic conceptions may have to be interpret- 
ed differently from research based on scientific conceptions. Incidental 
to the discussion of this theme, we shall attempt briefly to clarify the 
concepts of diagnosis and treatment. 

That we are dealing with a real and not an imaginary problem 
may be inferred from a casual perusal of the writings of prominent 
clinical psychologists. Louttit (7) and Westburgh (17), for example, 
have expressed themselves categorically on this point with the declara- 
tion that the practice of clinical psychology is an art. They imply that 
the clinician, by virtue of some artistic gift, adds something to the 
objective measures obtained by scientific means. 

Perhaps the most insistent question in this inquiry has to do 
with the definition of the field of clinical psychology. The activities 
of those who profess to be clinical psychologists are so diverse, so 
heterogeneous, that we cannot be content with an answer to the simple 

* Paper read at the 49th annual meeting of the American Psychological 
Association at Evanston, Illinois, September 3-6, 1941. 


† This analysis may be applied with little modification to the fields of social
work, psychiatry, vocational guidance, and related professions.

question: “what do clinical psychologists do?” As a matter of fact,
psychologists disagree among themselves as to the delimitation of the 
field. Louttit (7) has shown that there are at least four conceptions 
of clinical psychology. The interpretation most acceptable to this 
writer (and to many of his colleagues) may be stated as follows: clini- 
cal psychology consists of those activities necessary for the diagnosis 
and treatment of individuals who require assistance in the solution of 
their social psychological problems. Diagnosis and treatment, accord- 
ing to this definition, are the chief functions of clinical psychologists. 
Parenthetically, it should be added that in most clinics the psychologist 
serves only a diagnostic function, the therapeutic function being dele- 
gated to psychiatrists, teachers, or social workers. For the present we 
shall be concerned only with the concept of diagnosis, delaying analysis
of the treatment phase until later.

Our task now is to determine whether diagnosing is artistic or 
scientific. Before we can do this, we must clarify the referent for 
which the term “diagnosis” stands. A search of the literature reveals 
that diagnosis has two principal meanings. The first takes the form 
of a literary description of an individual from a sampling of his 
present and past behaviors. Such descriptions are to be found in his- 
torical research, in literary efforts, and in places where mere descrip- 
tion of an individual or event is the aim of the writer. 

A second meaning of diagnosis likewise is that of description of 
an individual, but with a future referent. Here the main purpose is to 
provide a statement of prophecy from a statement of status; that is 
to say, to uncover the events of the past which will serve as predictive 
indices for the future. The clinical psychologist, like the physician, 
is interested in the past only as it reveals vectors into the future. An 
example from a case history will clarify the distinction. The results 
of examination and history-taking reveal that Jack Smith, age 10, is 
in the fourth grade but that his reading skills are equivalent to those 
of the average second-grade pupil. The case notes might state the 
diagnosis as follows: Diagnosis—two year retardation in reading 
achievement. If this sentence carried no more meaning than that of 
the present state of the individual with regard to his reactions to 
printed words, it would be a literary description. But the clinician 
implies something more—he implies that this current behavior will 
lead to certain kinds of behavior in the future. More specifically, the 
diagnosis of reading disability in Jack Smith means that unless some- 
thing is done about it, Jack will become an academic failure, perhaps 
engage in anti-social forms of conduct, and do many of the other 
things which are associated with academic failure. This clinical diagnosis
has a referent in the future. As such it is a prediction. The
clinician makes this tacit prediction: if nothing is done to remedy the
reading disability, the boy will probably fail in school. 

As used in the applied sciences of medicine and psychology the 
term diagnosis usually implies prognosis—not only the description of 
the disease but also the course that it will take, not only a statement 
of the social psychological conflict but also the behavior of the indi- 
vidual in the future. The interest in prediction differentiates the 
applied scientist from the literary artist. Without a system of knowl- 
edge of the course of the particular disease in similar cases, the 
physician cannot make a predictive diagnosis: all he can do is describe 
the symptoms, i.e., make a literary diagnosis. The psychologist faced 
with a problem child is likewise thwarted by the absence of knowl- 
edge; if he has never had experiences with similar symptom-com- 
plexes he can do no more than write a literary description of the 
child. 

Rogers (13) recognizes this need for diagnoses which have re- 
ferents in the future when he writes: “The fact-finding and classify- 
ing aspects of diagnosis have their place, but in dealing with the 
individual child there must be a diagnosis which goes deeper, dis- 
covers the meaning of the various elements, and points the way toward 
treatment.” 

This discussion leads to the conclusion that diagnosis, in order to 
be meaningful, must have a referent in the future. A clinician’s diag- 
nosis is static, i.e., purely descriptive of past events without implica- 
tions for the future, when it has no function in terms of treatment. 
His diagnosis is meaningful and dynamic when it provides a probabil- 
ity-statement of future behavior under alternative conditions. 

All this means that whenever a clinical psychologist formulates 
a meaningful diagnosis, he expresses or implies a prediction. The 
next step in our examination probes briefly into the nature of psycho- 
logical prediction. Psychological writers, among them Gordon Allport 
(1), Brown (4), Williamson (18), and Viteles (16), have postulated 
two forms of prediction: the statistical or actuarial method, and the 
clinical or case study method. To this point, the mathematician Reich- 
enbach (11) has clearly demonstrated (1) that all meaningful predic- 
tions are probability-statements, and (2) that the postulation of two 
forms of prediction calls for two interpretations of probability. In 
another paper (15), the author has applied Reichenbach’s treatment 
of this problem to clinical and statistical prediction in psychology. The 
conclusions in that paper point out that only one interpretation of 
probability, the frequency interpretation, is necessary in making 
clinical or statistical predictions of behavior. Predictions may be made 
on the basis of a regression equation, such as $X_1 = b_2X_2 + b_3X_3 + C$,
or on the basis of crude generalizations from previous cases. An example
of the latter would be: if a clinician’s experience has shown an
association of antecedents X, Y, and Z, with consequent “neurosis” 
in 80 out of 100 cases, the diagnosis of a case where X, Y, and Z were 
present could be meaningfully stated as “potential neurosis.” 

In dealing with a series of individuals who present similar antecedents,
the psychologist predicts for any single individual on the
basis of relative frequency of occurrence in the past. If the psycholo- 
gist happens to be interested in studying only one individual, then he 
will make his predictions by ordering to a class similar behavior-seg- 
ments of that individual and formulating a probability-statement in 
the same way—namely, on the basis of the relative frequency of oc- 
currence. (The term teleonomic has been used by F. Allport (2) to 
designate the prediction of behavior for an individual on the basis of 
a series of observations of that individual.) 

It is submitted, then, that a diagnosis, to be meaningful, must be 
predictive, and that predictions are the result of statistical generali- 
zations. Since prediction is considered the hallmark of science, we may 
safely conclude that diagnosis is a scientific enterprise. This conclu- 
sion will immediately draw fire from clinicians of all schools of 
psychological thought. They will maintain stoutly that statistical 
generalizations destroy vital and dynamic elements gathered in the 
case study. We might answer that these so-called dynamic elements, 
though interesting from a literary point of view, are scientifically 
meaningless unless they have predictive value. If these factors cannot 
be shown to have some observed association with the criterion, they 
are utterly useless. 

Among sociologists, the problem of the relative merits of the 
case study and statistical generalization has been a subject of pro- 
longed discussion. Lundberg’s remarks may profitably be quoted here: 


“... case studies become significant scientifically only when
they are classified or summarized in some way so that the 
uniformities in large numbers begin to stand out and group 
themselves into general patterns or types. It is in this process 
of summarization, without which a large number of case 
studies are practically useless for scientific purposes, that 
the statistical method, in cruder or more refined form, is not 
only useful but absolutely necessary.” (8) 


The present author agrees with Lundberg in that useful diagnoses 
always proceed from generalizations, whether based on a rigorous 
statistical method or upon a crude empirical method which has been 
variously named intuition, insight, verstehen, etc. When a clinician is
put to the test to defend a diagnosis, he may resort to the statement
that it was “the general feel of things” in the interview that influenced 
him. By pushing him back, however, it is possible usually to discover 
the empirical basis for the diagnosis. That these inferences are in- 
formal and not made with the benefit of Hollerith cards and Monroe 
calculators is beside the point. They are drawn from the clinician’s 
cumulative experience. If they are not, then the diagnostic function 
must be relegated to individuals with some sort of magical power. 
“Thus the only possible question as to the relative value of the case 
(or clinical) method resolves itself into a question as to whether the 
classification of, and generalization from, the data shall be carried 
out by the informal, qualitative, and subjective method --- or the 
systematic, quantitative, and objective procedure of the statistical 
method.” (8). 

At this point the critic will hold up his hand and bid us go no 
further: All that you say is true, he tells us, if you accept the postulate 
that clinical psychology is a science. But (and I quote from West- 
burgh’s text) “Clinical psychology is an art—not a science. The 
clinical psychologist uses those scientific findings and techniques 
which are applicable to his clinical problems . . . Then even while he 
is developing such a complete personality study, he engages upon the 
genuinely artistic task of helping the patient to solve his own prob- 
lem.” (17) (italics added). 

This expression, genuinely artistic task—without further defini- 
tion—leads us into a morass. The possible meanings for the words 
art and artistic as used here are: (a) skill in the use of tools; (b) 
individual explorations into the unknown; (c) possession of a unique 
talent or gift; (d) so-called intuitive operations. 

(a) If art means the skillful use of tools, then we must ask, 
whence come these skills? It is unnecessary to elaborate on the point 
that skills are acquired from experience with tools. For example, if a 
clinician can make ingenious predictions of social adjustment from 
the perusal of certain psychological tests, he would be demonstrating 
his skill. Such predictions are obviously made against a background of 
previous experience with psychological tests and social behavior. With 
this conception, the writer has no quarrel. It does not postulate a 
super-empirical method of understanding. It is not, therefore, a
material departure from the proposition that clinical psychology is 
scientific in that predictions are made on the basis of empirical data. 

(b) If art means individual explorations into the unknown, we 
have no way of checking on the validity of predictions formulated in 
the name of art. If a clinician should make a diagnosis and prescribe 
treatment for a case that was unique, idiosyncratic, in every conceivable
way, he would be venturing into the unknown. He would be
guessing. This would be an expression of personal taste. If the clini- 
cian had no experiential background, no knowledge of similar cases, 
then he would be making a truly individual prediction. Unless such a 
single prediction is ordered to a class of events, it cannot be verified 
and is, therefore, meaningless. 

(c) If, in this context, art means the possession of a gift or talent 
for “making friends and influencing people,” then we can look for 
little progress in the field of clinical psychology. If clinical psychology 
is an art because some clinicians possess unique traits, and if complex 
human problems can be solved only by these specially-gifted people, 
then we must agree with Rogers and “admit that we can never deal 
in any large way with the multitude of ills which we group together
as conduct problems, since the talents of the artist can be little con- 
veyed to his fellows.” (13). Recognition of this problem is also given 
in one of the most provocative books to be published recently on per- 
sonnel administration. Roethlisberger and Dickson make this gener- 
alization on the basis of the outcome of a thorough-going research pro- 
gram in personnel administration: 


“The skill (of diagnosing human situations) should be ‘ex- 
plicit’ because the implicit or intuitive skills in handling hu- 
man problems which successful administrators ... possess 
are not capable of being communicated and transmitted. 
They are the peculiar property of the person who exercises 
them; they leave when the executive leaves the organization. 
An ‘explicit’ skill, on the other hand, is capable of being re- 
fined and taught and communicated to others.” (9) 


In this connection, it should be pointed out that the so-called art of 
interviewing, long considered an implicit or intuitive skill, has recently 
been studied, refined, and communicated to others. Porter (10) and 
Bordin and Sarbin (3) have studies in progress which show how these 
so-called artistic skills may be taught and learned. 

(d) If art means some super-empirical method of understanding, 
then we must surrender our ideas about communicating techniques 
and procedures in clinical psychology. If we depart from the method 
of logical inference, i.e., the scientific method, then we must perforce 
adopt some so-called intuitive approach. Not inductive, not based on 
logical inference, the intuitive method of understanding is described 
by Klein as follows: 


“... (it is) the task of fathoming human motives or appreciating
the entire gamut of human desires ... (it) requires
a knowledge of human nature. It represents the type of understanding
indispensable for the development of psychology
as a social science or as a Geisteswissenschaft.” (6). 


The traditional methods of science, he points out, have a place in 
psychology, but the intuitive approach, characterized by the quotation 
above, is to reap the harvest in psychology. One question that is raised 
but not answered is this: how does one acquire this “knowledge of 
human nature” which is so indispensable to the social scientist? 
Windelband suggests that it cannot be acquired: 


“The psychology which the historian uses is a very different 
thing (from scientific psychology). It is the psychology of 
daily life: the practical psychology of the poet and the great 
statesman—the psychology that cannot be taught and 
learned, but is the gift of intuitive intelligence, and in its 
highest form a genius for judging contemporary life and
Agee This sort of psychology is an art and not a science.” 


Other writers have pointed out that these so-called intuitions, if 
they are valid at all, are the products of experience. They are infer- 
ences which are not recognized as such. Those who still cling to the 
theory that intuitive predictions of behavior are more accurate must 
show two things: first, that intuitive predictions are not formulated 
from empirically-observed data; and second, that such “intuitions” are 
more accurate than predictions made from statistical generalizations. 
The research that has been reported to date suggests that intuitive, 
artistic predictions are not more accurate than those based on statisti- 
cal and scientific concepts. Sarbin (14) has submitted evidence which 
shows that predictions of a complex social psychological activity— 
academic achievement—made by experienced clinicians are not more 
accurate than those made from a regression equation. The clinicians 
had available numerous psychological test scores, personal data sheets, 
the notes of another interviewer, and whatever data they could gather 
in the face-to-face interview. The regression equation was one in which 
two measurement variables were used to predict the criterion. In 
Wittman’s studies (20, 21), a prognostic rating scale of 30 items was 
found to be more accurate in predicting the behavior of psychotic 
patients than the use of “intuitive” psychiatric generalizations. These 
reports are cited in evidence of the greater accuracy of predictions 
made on the basis of scientific concepts. 

At this point we may briefly look into the other major aspect of 
clinical psychology—treatment. Are therapeutic efforts artistic or 
scientific in nature? If we accept the postulate stated earlier, that
prediction is the hallmark of science, then we need but apply the
same logic here. Upon what logical basis does a clinician prescribe 
treatment? He first makes a number of implicit predictions of this 
kind: “If treatment A is used in this-and-this problem there is a high- 
er probability of improvement than if treatment B or C is used.” 
These predictions may be, and usually are, based on clinical experi- 
ence. Prediction of probable outcome of alternative conditions, wheth- 
er implied or stated overtly, is always the first step. Unfortunately, 
few critical studies have been made which would give us prediction 
tables for various types of treatment of social psychological problems. 
As an example of the type of study which can be used to predict 
efficacy of treatment, we can refer to one in the field of neuro-psychia- 
try. Dub and Lurie (5) report the beneficial effects of benzedrine on 
depressive patients in 75% of their cases. If these findings are verified, 
the psychiatrist, when faced with a patient presenting the depressive 
syndrome, will prescribe treatment on the basis of statistics for treated
and cured cases. In so doing, of course, he will be making an implicit
prediction: the chances are (n − m)/n or 3/4 that this treatment
will result in a change of behavior of psychiatric significance.
Until studies of this general type have been made for various clinical
syndromes, psychologists will continue to rely on subjective and 
crude generalizations based on clinical experience. 

A critic may rise to this occasion and declare that statistics have 
not been developed adequately to deal with clinical material. This is 
fallacious. That many kinds of clinical data are crude and not adapted 
to refined treatment is obvious. It is needless to emphasize that at 
this point in clinical research, refined statistical analyses are usually
unnecessary. Parker (9) has suggested that the Chi-square test be 
applied to statistics of treated and cured cases where pharmacological 
preparations have been used. This same procedure is well-adapted to 
many of the problems of clinical research and should be considered by 
those interested in evaluating clinical procedures. Analysis of vari- 
ance may also be utilized for certain problems. Once the hypothesis 
is formulated and the data collected, the appropriate statistical tool 
will not be difficult to discover. 
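
To indicate how little machinery such a test requires, the following sketch
in Python computes the ordinary Chi-square (without continuity correction)
for a fourfold table of treated and untreated cases. The function name and
the counts are hypothetical and are not drawn from any of the studies cited;
the value 3.841 is the familiar 5 per cent point of Chi-square with one
degree of freedom.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square, without continuity correction, for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Hypothetical counts: rows are treated / untreated cases,
# columns are improved / not improved.
chi2 = chi_square_2x2(30, 10, 15, 25)

# Compare with the 5 per cent point of Chi-square for 1 degree of freedom.
significant_at_05 = chi2 > 3.841
```

With small cell frequencies a correction for continuity or an exact test
would ordinarily be preferred; the sketch is meant only to show the form of
the computation.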

Because of efforts at condensation, I may have left the impression 
that investigations not immediately adapted to statistical treatment 
are not of value. This is not so. It is true that any scientific investiga-
tion must deal with precise relations. But before the period of mathe-
matical precision, there must first be a period of crude observation
and hypothesizing. The history of science shows many instances where 
scientists had to learn how to enumerate the objects of their inquiries 
or to discover what sort of measurement it was most profitable to 
make. 

I believe the psychological clinic has been overlooked as a start- 
ing place for research investigations in social psychology. That the 
clinical psychologist is in a most enviable position for formulating 
hypotheses soon becomes apparent to anyone. Every case abounds in 
hypotheses which arise out of the clinician’s interbehaving with con- 
crete human beings in social psychological situations. By submitting 
these hypotheses to test, the clinician may make important contribu- 
tions not only to applied science, but to theoretical science as well. At 
the same time that he discovers a relationship between antecedent and 
consequent which is clinically useful, he makes a contribution to social 
psychological theory. 

To summarize: on the basis of a logical analysis of the two chief 
functions of clinical psychologists—diagnosis and treatment—it is 
submitted that clinical psychology is a scientific enterprise. Prediction, 
the sine qua non of science, is closely bound up with the concepts of 
diagnosis and treatment. Predictions are made from empirically- 
observed events. To observe and deal with these events it is unneces- 
sary to postulate artistic and intuitive concepts. It is submitted that 
the traditional method of science, logical inference, provides a pattern 
for clinical psychology as it does for the rest of the scientific world. 

PORTABLE RESPONSE TIMER 


J. E. P. LIBBY AND A. L. HUNSICKER 
PSYCHOMETRIC LABORATORY, UNIVERSITY OF CHICAGO 


The instrument to be described was developed from apparatus 
the original design of which, by Landahl and Bechtoldt, was unpub- 
lished. Its function is to measure the times of simple reaction, choice, 
perception of projected patterns, and perception of dim patterns after 
light adaptation. It may also be used to time verbal response to 
multiple-choice material when combined with the response relay 
described by Libby elsewhere in the present journal. 

The complete instrument, including chronoscope, subject keys, 
and experimenter keys, may be carried in a case 11" x 6.5" x 6.5". The 
chronoscope, M, occupies the left half of the case. The upper half of 
the right-hand section serves for carrying the subject key (H) or, as 
shown in the accompanying cut, the microfilm container and projector 
slides. All remaining apparatus is in the lower half of this compart- 
ment. 

The use of three Microswitches, two of which are actuated by the 
same subject key, permits indication by the panel lights, K, not only 
when the right key, S; , or the left key, S. , is depressed but also when 
both are depressed. 

The relay serves both to prevent restarting of the chronoscope 
when the subject releases his hand from the key and, when used with 
the response relay, to permit the use of a light adaptation lamp of 500 
watts, an excessive load for the response relay proper. In this con-
nection, the dual condenser, C1 and C2, prevents damaging arcing. 

The “stimulus” outlet, marked E, may supply any 110-volt appa-
ratus, such as a light or a bell or, as in the present case, a lantern for 
projecting test material from microfilm. The tip-jack in the center of 
the panel is for voice connection. The two toggle switches, I, below it 
adapt the circuit for use with the voice key or with manual keys. 
Therefore, these three items may be eliminated where no verbal re-
sponse is required. The dial light, L, was cut down from a “Snapit” 
lamp available in any five-and-ten-cent store and is removable for car- 
riage. 

The accompanying figure shows the back of one response timer 
(left) and the face of an identical unit (center); the subject keys 
(H) may be placed at any convenient distance from the main unit. 

In operation the experimenter presses switch G, thereby throwing 
the relay and simultaneously starting the chronoscope and stimulus. 
The subject responds to the stimulus by pressing S, or S., lighting 
the appropriate pilot light, K, and opening the relay, thus stopping 
the chronoscope and the stimulus and turning on the panel lamp, L. 
When verbal response is desired the toggle switches, I, are reversed 
in position and the tip-jack connection is made with the Libby Re-
sponse Relay. 
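
The operating sequence just described can be sketched as a small simulation. This is a sketch in Python under stated assumptions, not a description of the circuit itself: the class and method names are hypothetical, clock readings are passed in by the caller in place of a running chronoscope, and the voice-key arrangement is omitted.

```python
class ResponseTimer:
    """Toy model of the manual-key sequence (all names are hypothetical)."""

    def __init__(self):
        self.relay_closed = False   # relay thrown: chronoscope and stimulus running
        self.pilot_light = None     # which panel light is lit: "right", "left", or None
        self.elapsed = None         # reading left on the chronoscope, in seconds
        self._started_at = None

    def press_experimenter_key(self, now):
        # Pressing the experimenter's key throws the relay, which starts
        # the chronoscope and the stimulus together.
        if not self.relay_closed:
            self.relay_closed = True
            self.pilot_light = None
            self.elapsed = None
            self._started_at = now

    def press_subject_key(self, side, now):
        # The subject's response lights the matching pilot light and opens
        # the relay, which stops the chronoscope and the stimulus.
        if self.relay_closed:
            self.pilot_light = side
            self.elapsed = now - self._started_at
            self.relay_closed = False

    def release_subject_key(self):
        # The relay stays open, so releasing the key does not restart
        # the chronoscope.
        pass


timer = ResponseTimer()
timer.press_experimenter_key(now=0.0)
timer.press_subject_key("right", now=0.25)
timer.release_subject_key()
print(timer.pilot_light, timer.elapsed, timer.relay_closed)
```

Passing the clock reading as an argument keeps the sketch deterministic; a working program would read a real clock at each key press.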


The cost of the material for this instrument is under $10.00 ex- 
clusive of a Standard Electric Time chronoscope. 

S,—SPST 110 AC line switch 
Sy,» Sw,» Si—SPDT “Microswitches” (Subject’s Keys) 
Sy, S,,,—SPDT Toggle switch 
Sy,;—“Pear Switch” (Experimenter’s Key) 
C1, C2—0.5 mfd., 400 volts working 
Dial and Panel Lights—110 volts, 6 watts 
Adaptation Lamp—500 watts 
R—SPDT 110 AC relay 
Clutch and Motor—Refer to Standard Electric Time AC chronoscope 
Stimulus—See text 

A—AC Line Plug 
B, C—Outlet for Subject Key Unit 
D—Experimenter Key Outlet 
E—Stimulus Outlet 
F—Adaptation Lamp Outlet 
G—Experimenter’s Key 
H—Subject Key Unit 
I—Toggle Switches 
J—Tip Jack 
K—Panel Lights 
L—Dial Lights 
M—Chronoscope 
S—Line Switch 

ERRATA 


In the article “The Analysis of Variance and Covariance Tech- 
niques in Relation to the Conventional Formulas for the Standard 
Error of a Difference,” by Max D. Englehart, Psychometrika, 6, Au- 
gust, 1941, Page 221, the following typographical errors occurred. 

On page 224, the middle paragraph should begin, “Since the mean 
of a pair of scores equals $(X_1 + X_2)/2$ and n for the pair equals 2, the 
sum of squares between the means of pairs, ~>(M,— M)?, be 
comes ---” 

In the formula for F at the top of page 227, no bar should appear 
over x in the expression for $(\sum xy)^2$; and after the formula it should 
state that “xz and y are the group means --- ” 

In the last sentence at the bottom of the page 
“$\sum (x - \bar{x})(y - \bar{y}) = \sum xy$.” 

At the top of page 229 “$\sum x^2$ and $\sum (x - \bar{x})^2$ reduce for both 
groups to $N_1 \bar{x}_1^2 + N_2 \bar{x}_2^2$.” 